In this document you will find guidance on how to use various
R libraries and functions for estimating true disease
prevalence and the accuracy of diagnostic tests in a Bayesian framework,
along with various exercises. We assume that you have some basic
R skills. If you have not worked with R before,
or feel a bit rusty, here are some resources to help you prepare:
Chapters 1, 2, and 4 of the CodeCademy “Learn R” course will provide a good overview of the basic concepts required for this workshop.
If you are familiar with R and want to do some
further reading, Hadley Wickham’s “R
for Data Science” is a great resource.
Remember, there are often many different ways to accomplish a given
task in R. Throughout this document, we tried to
stick with the simpler approaches (e.g., the shortest code, the minimal
number of R libraries).
Throughout the document, you will find examples of R
code along with comments. The R code always
appears in grey boxes (see the following example). This is the
code that you will be able to use for your own analyses. Lines
that are preceded by a # sign are comments; R skips
them when reading the code. Following each grey box with
R code, a white box with the results from the analysis
is presented.
For instance, here is a chunk of R code where I simply ask
for the main descriptive statistics of the speed variable of
the cars dataset (note that this dataset is already part of
R).
#This is a comment. R will ignore this line
#The summary() function can be used to ask for a summary of various R objects. For a variable (here the variable speed), it shows descriptive statistics.
summary(cars$speed)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 4.0 12.0 15.0 15.4 19.0 25.0
Throughout the document, in the text we will use:
- Italics for datasets or variables. For instance, the speed
variable of the dataset cars.
- Shaded boxes for R libraries (e.g.,
episensr) and functions (e.g., summary()).
In R, we can first call a given library and then use the
functions from that library, or we can type the name of the library
followed by :: and then the function name. For instance, the two
following pieces of code are equivalent:
library(ggplot2)
ggplot(data=cars, mapping=aes(x=speed))+
geom_histogram()
##OR##
ggplot2::ggplot(data=cars, mapping=ggplot2::aes(x=speed))+
ggplot2::geom_histogram()
The latter may improve reproducibility, but at the expense of longer code (note that, with this approach, every function from the library needs the :: prefix). Throughout the document, we will always first call the library and then run the functions, to keep the code short.
One last thing: when using a given function, it is not mandatory to
name all the arguments, as long as they are presented in the sequence
expected by that function. For instance, the ggplot()
function that we used in the previous chunk of code expects
first a dataset (data=) and then a mapping attribute
(mapping=) and, within that mapping attribute, an x variable
(x=). We could shorten the code by omitting all of these argument names.
The two following pieces of code are, therefore, equivalent:
library(ggplot2)
ggplot(data=cars, mapping=aes(x=speed))+
geom_histogram()
##OR##
library(ggplot2)
ggplot(cars, aes(speed))+
geom_histogram()
Throughout the document, however, we will use the longer code with all the arguments being named. Since you are learning these new functions, it would be quite a challenge to use the shorter code right from the start. But you could certainly adopt the shorter codes later on.
LET’S GET STARTED!
When using latent class models (LCM) to estimate disease prevalence or diagnostic test accuracy, we will need to use various distributions. There are two situations where we will use distributions: (1) to describe our prior knowledge about an unknown parameter (i.e., prior distributions) and (2) to link the observed data to the unknown parameters (i.e., likelihood functions).
Some distributions will more often be used for the first objective, others will mainly be used for the second. In the following sections, we will cover the most useful distributions.
One of the first things that you will need for any Bayesian analysis is a way to generate and visualize a prior distribution corresponding to some scientific knowledge that you have about an unknown parameter. Generally, you will use information such as the mean and variance or the mode and the 2.5th (or 97.5th) percentile to find a corresponding distribution. Here we present three of the most important distributions. We will give more details on each of them in the following sections:
- Normal distribution: defined by a mean
(mu) and its standard deviation (SD) or variance. In some R packages and software,
rjags and JAGS for instance, a normal distribution is specified using its
precision (tau), which is the inverse of the variance (tau = 1/SD^2).
Notation: dnorm(mu, tau)
- Uniform distribution: defined by a minimum
(min) and maximum (max).
Notation: dunif(min, max)
- Beta distribution: bounded between 0 and 1, beta
distributions are defined by two shape parameters, a and
b.
Notation: dbeta(a, b)
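Because the precision parameterization is a common source of confusion, here is a minimal base-R sketch converting a mean/SD normal prior into the mean/precision format that JAGS expects:

```r
# Convert a mean/SD normal prior into the mean/precision format used by JAGS
mu <- 2.0        # prior mean
sd <- 0.5        # prior standard deviation
prec <- 1/sd^2   # precision = inverse of the variance
prec             # 4
```

So a normal prior with a mean of 2.0 and SD of 0.5 would be written dnorm(2.0, 4) in JAGS model code.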
The dnorm() function can be used to generate a given
Normal distribution and the curve() function can be used to
visualize the generated distribution. These functions are part of
base R; there is no need to load an additional R
package.
curve(dnorm(x, mean=2.0, sd=0.5), # I indicate mean and SD of the distribution
from=-2, to=7, # I indicate limits for the plot
main="Normal distribution with mean of 2.0 and SD of 0.5", #Adding a title
xlab = "Value", ylab = "Density") #Adding titles for axes
Density curve of a Normal distribution.
Note that a Normal distribution with a mean of zero and a very large SD provides very little information. Such a distribution would be referred to as a flat distribution (a.k.a. a vague distribution).
curve(dnorm(x, mean=0.0, sd=1e10),
from=-100, to=100,
main="A flat Normal distribution",
xlab = "Value", ylab = "Density")
Density curve of a flat Normal distribution.
In the same manner, we could visualize a uniform distribution using
the dunif() function. In the following example, we assume
that any value between -5.0 and 5.0 is equally probable.
curve(dunif(x, min=-5.0, max=5.0),
from=-10, to=10,
main="Uniform distribution with -5.0 and 5.0 limits",
xlab = "Value", ylab = "Density")
Density curve of a Uniform distribution.
Beta distributions are another type of distribution that will
specifically be used for parameters that are proportions (i.e., bounded
between 0.0 and 1.0). For this specific workshop, they will be very
handy, since sensitivity, specificity, and prevalence are all
proportions. The epi.betabuster() function from the
epiR library can be used to define a given prior
distribution based on previous knowledge. When you use the
epi.betabuster() function, it creates a new R
object containing different elements. Among these, one element will be
named shape1 and another shape2. These correspond to
the a and b shape parameters of the corresponding Beta
distribution.
For instance, we may know that the most likely value for the
sensitivity of a given test is 0.85 and that we are 97.5% certain that
it is greater than 0.75. With these values, we will be able to find the
a and b shape parameters of the corresponding Beta distribution.
Within the epi.betabuster() function, the
greaterthan= argument indicates whether
the next argument, x=0.75, describes the 2.5th
(greaterthan=T) or the 97.5th (greaterthan=F)
percentile.
library(epiR)
# Sensitivity of a test as Mode=0.85, and we are 97.5% sure >0.75
rval <- epi.betabuster(mode=0.85, conf=0.975, greaterthan=T, x=0.75) # I create a new object named rval
rval$shape1 #View the a shape parameter in rval
## [1] 62.661
rval$shape2 #View the b shape parameter in rval
## [1] 11.88135
#plot the prior distribution
curve(dbeta(x, shape1=rval$shape1, shape2=rval$shape2), from=0, to=1,
main="Prior for test's sensitivity", xlab = "Proportion", ylab = "Density")
Density curve of a Beta distribution for a test sensitivity.
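As a quick sanity check on the shape parameters printed above, we can use the formula for the mode of a Beta distribution, mode = (a - 1)/(a + b - 2), to confirm that epi.betabuster() did place the mode at 0.85:

```r
# Shape parameters returned by epi.betabuster in the preceding chunk
a <- 62.661
b <- 11.88135
(a - 1)/(a + b - 2) # mode of the Beta(a, b) distribution; ~0.85 as requested
```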
Note that a dbeta(1.0, 1.0) is a uniform beta distribution.
#plot the prior distribution
curve(dbeta(x, shape1=1.0, shape2=1.0), from=0, to=1,
main="A Beta(1.0, 1.0) or flat distribution", xlab = "Proportion", ylab = "Density")
Density curve of a Beta(1.0, 1.0) distribution.
In many situations, a distribution will be used as a part of the likelihood function to link observed data to unknown parameters. The ones we will most frequently use are:
- Binomial distribution: For variables that can take
the value 0 or 1. Binomial distributions are defined by the probability
that the variable takes the value 1, (P), and a number of
“trials”, (n).
Notation: dBin(P, n)
- Multinomial distribution: made up of a number of
probabilities, all bounded between 0 and 1, that, together, sum
up to 1. Multinomial distributions are defined by k probabilities
(P1, P2, …, Pk) and the number of
observations (n).
Notation: dmulti(P[1:k], n)
If we have a variable that can take only two values, healthy or diseased (0 or 1), we can estimate an unknown parameter such as the proportion (P) of diseased individuals (i.e., the disease prevalence), based on the observed data (a number of positive individuals, T AND a total number of individuals, n). For instance, if we had 30 diseased (T = 30) out of 100 tested individuals (n=100) we could estimate the unknown parameter P using this very simple likelihood function:
\(T \sim dbin(P, n)\)
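To build a little intuition for this likelihood function, here is a base-R sketch (not needed for the Bayesian analyses themselves) evaluating the binomial likelihood of our observed data over a grid of candidate values for P:

```r
# Likelihood of observing T=30 positives out of n=100 for candidate values of P
P.grid <- seq(0.01, 0.99, by=0.01)
lik <- dbinom(30, size=100, prob=P.grid)
P.grid[which.max(lik)] # 0.3
```

The likelihood peaks at P = 0.30, the observed proportion, which is exactly where a Frequentist maximum likelihood estimate would land.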
A last type of distribution that we will use in our LCM is the multinomial distribution. When an outcome is categorical with >2 categories, we could use a multinomial distribution to describe the probability that an individual has the value “A”, or “B”, or “C”, etc. In our context, the main application of this distribution will be for describing the combined results of two (or more than two) diagnostic tests. For instance, if we cross-tabulate the results of Test A and Test B, we have four potential outcomes:
- Test A+ and Test B+ (let's call n1 the number of individuals
with that combination of test results and P1 the probability
of falling in that cell)
- Test A+ and Test B- (n2 and P2)
- Test A- and Test B+ (n3 and P3)
- Test A- and Test B- (n4 and P4)
We can illustrate this as follow:
|  | Test A+ | Test A- |
|---|---|---|
| Test B+ | n1 | n3 |
| Test B- | n2 | n4 |
Thus we could describe the probabilities (P1 to P4) of an individual falling into one of these cells of the 2x2 table as a multinomial distribution. In this specific case, we would say that the combined results of the 2 tests (n1 to n4) and the total number of individuals tested (n), which are the observed data, are linked to the 4 probabilities (P1 to P4; the unknown parameters) as follows:
\(n[1:4] \sim dmulti(P[1:4], n)\)
Which means: the value of n1 (or n2, n3, or n4) is determined by the probability of falling into the “Test A+ and Test B+” cell, which is P1 (or P2, P3, or P4 for the other cells), and by the total number of individuals tested (n). Nothing too surprising here… If I have a probability of 0.30 of falling into a given cell, and I have tested 100 individuals, I should expect around 30 individuals in that cell. Still, the multinomial distribution is nice because it will ensure that all our probabilities (P1 to P4) sum up to exactly 1.0.
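We can mimic this behaviour with base R's rmultinom() function (a small simulation sketch; the cell probabilities below are made up for illustration). Whatever values are drawn, the four cell counts always sum to n:

```r
set.seed(123)
# Simulate cell counts (n1 to n4) for 100 tested individuals
P <- c(0.50, 0.10, 0.05, 0.35)            # made-up P1 to P4, summing to 1
n.cells <- rmultinom(1, size=100, prob=P) # one draw of the four cell counts
sum(n.cells)                              # always exactly 100
```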
Different software packages and R packages can be used to run
Bayesian models. Whether it is a LCM or a more conventional regression
model is not very important at this stage. In this document we will
mainly use software called JAGS and the R2jags package.
So, if not already done, you will have to download JAGS here and
install it on your computer. Once it is installed, we will be able to
use R2jags to operate JAGS through R.
Basically, to run any Bayesian model, we will always need the same three things: the data, the model, and a set of initial values.
This is the easy part. If we were to run, for instance, a
“conventional” regression analysis, we would have to import into
R a complete dataset (e.g., a CSV file) with a row for each
observation and a column for each variable. However, when using LCM to
estimate a disease prevalence or diagnostic accuracy, the dataset is
simply made up of the number of individuals in each category. For
instance, to estimate a simple prevalence (Prev), we could have
30 individuals that tested positive (variable T) out of a total
of 100 individuals (variable n). In such a case, the dataset
could be created as follows:
#Creating a dataset
datalist <- list(T=30, n=100)
If you want to check that it works, you could ask to see this new
R object.
datalist
## $T
## [1] 30
##
## $n
## [1] 100
Let’s start with a simple example, estimating a single proportion. For estimating a proportion we could describe the likelihood function linking unknown parameters to observed data as follows:
\(T \sim dbin(Prev, n)\)
In other words: the disease prevalence (Prev), an unknown parameter, can be linked to the observed number of positive individuals (T) and the total number of tested individual (n) through the binomial function.
In this case, the values of T and n are the observed data and Prev is the only unknown parameter.
We will have to specify a prior distribution for each unknown parameter (using the \(\sim\) sign).
In the preceding example, we only have one unknown parameter,
Prev, which is a proportion. Theoretically, such a parameter could
take values from 0 to 1, so a beta distribution would be one way to
describe its prior distribution. We could use the
epi.betabuster() function from the epiR
library to define a given beta distribution based on previous knowledge,
as described in the preceding sections. Moreover, if we want to use a
flat prior, we could simply choose the value 1.0 for both the a and b shape
parameters.
In the following code, I create R objects for the a and
b shape parameters of the beta prior distribution for Prev.
Prev.shapea <- 1 #a shape parameter for Prev
Prev.shapeb <- 1 #b shape parameter for Prev
With informative priors, for instance, if we know from the literature that the prevalence is expected to be 0.25 with a 95CI of (0.20, 0.30), we could instead use the following code and assign the resulting values to Prev.shapea and Prev.shapeb:
library(epiR)
# Prevalence as Mode=0.25, and we are 97.5% sure >0.20
rval <- epi.betabuster(mode=0.25, conf=0.975, greaterthan=T, x=0.20) # I create a new object named rval
rval$shape1 #View the a shape parameter in rval
## [1] 62.356
rval$shape2 #View the b shape parameter in rval
## [1] 185.068
#I left them as comments, but the two lines below would be used to assign the generated values for later usage
#Prev.shapea <- rval$shape1
#Prev.shapeb <- rval$shape2
Our Bayesian model has to be presented to JAGS as a text file (i.e.,
a .txt document). To create this text file, we will use the
paste0() function to write down the model and then the
write.table() function to save it as a .txt file.
An interesting feature of the paste0() function
is the possibility of including R objects in the
text. For instance, you may want to have a generic model that
will not be modified, but within which you would like to change the
values used to describe the priors. You could include these “external”
values within your paste0() function. Thus, you just have
to change the value of the R object that your model is
using, rather than rewriting a new model. For instance, we just created
two R objects called Prev.shapea and
Prev.shapeb (which were the a and b shape parameters for the
prior distribution of our proportion Prev). To describe our
prior distribution, we could ignore this previous work that we have done
and directly create a short text as follows:
text1 <- paste0("Prev ~ dbeta(1, 1)") #Creating a short text named "text1"
text1 #Asking to see the text1 object
## [1] "Prev ~ dbeta(1, 1)"
However, if the same model is to be applied later with a different
prior distribution, it may be more convenient to leave the a
and b shape parameters as R objects. Thus, if we
change the prior distribution and run the script again, the entire
analysis would be run with the newly created distribution.
text2 <- paste0("Prev ~ dbeta(",Prev.shapea,", ",Prev.shapeb,")") #Creating a short text named "text2" and containing other already created R objects
text2 #Asking to see the text2 object
## [1] "Prev ~ dbeta(1, 1)"
When using paste0(), any text appearing outside of sets
of quotation marks (e.g., Prev.shapea and Prev.shapeb
in the preceding script) will be treated as R objects
whose values need to be included in the text. Thus, later, I could just change
the values of Prev.shapea and Prev.shapeb and my
model would automatically be modified.
Some important notes about the text file containing the model
before we start with R2jags:
JAGS expects to see the word “model” followed by curly brackets { }. The curly brackets contain the likelihood function AND the priors.
Similarly to R, any line starting with a # will be
ignored (these are comments).
Similarly to R, <- is the assignment operator.
The ~ symbol means “follows the distribution”.
Similarly to R, JAGS is case sensitive (i.e.,
P does not equal p).
Loops can be written with the keyword “for” followed by curly brackets.
Arrays are indexed using square brackets [ ].
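To illustrate the last two points, here is a hypothetical sketch (not a model we will run in this document) combining a “for” loop and square-bracket indexing to specify the same binomial likelihood and beta prior for three herds at once:

```r
# Hypothetical JAGS model text with a loop over 3 herds and arrays T[], n[], Prev[]
model_loop <- paste0("model{
for(i in 1:3){
  T[i] ~ dbin(Prev[i], n[i])   #Likelihood for herd i
  Prev[i] ~ dbeta(1, 1)        #Flat prior for the prevalence in herd i
}
}")
```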
In the example below, I create an R object called
model_single_prop using the paste0() function and
then save it as a text file using the write.table()
function. We will check this text file below and then explain the code
for the model.
# Specify the model
model_single_prop <- paste0("model{
#=== LIKELIHOOD ===#
T ~ dbin(Prev, n)
#=== PRIOR ===#
Prev ~ dbeta(",Prev.shapea,", ",Prev.shapeb,") ## Prior for Prev
}")
#write as a text (.txt) file
write.table(model_single_prop,
file="model_single_prop.txt",
quote=FALSE,
sep="",
row.names=FALSE,
col.names=FALSE)
Here is a snapshot of the final result (i.e., the text file):
Text file with the model for estimating a single proportion.
In the preceding script, the paste0() function was used
to create the text, as an R object, that is needed for
the .txt file. This text is mainly “static”, except for the a
and b parameters of the beta distribution used to describe our
prior knowledge of disease prevalence. You can see that, if we were to
modify and rerun the preceding script where we assigned values to
these parameters, and then run the paste0() and write.table() functions again,
our text file would be updated with the new values for our prior
distribution. This is a feature that we will use a lot, especially with
more complex LCM, to minimize the amount of work.
The write.table() function was simply used to save as
a .txt file the R object containing the text we just
created.
Both the paste0() and write.table()
functions are part of base R; there is no need to install or call a
given library.
The initial value is the first value where each Markov chain will
start.
- You have to specify one initial value for each unknown parameter in
the model.
- If you decide to run three Markov chains, you will need three
different sets of initial values.
- You can also let the software choose a random value to start each
chain, but it will then be impossible to exactly
reproduce a given result. For instance, if you have obtained a very
awkward Markov chain, you may not be able to reproduce this result to
investigate what went wrong. Note, though, that it would be quite
uncommon for a Markov chain to be “sensitive” to the chosen starting
value. But you will see during the workshop that it does sometimes
happen.
In our preceding example we have one unknown parameter (Prev). If we were to run three Markov chains, we could generate three sets of initial values for this parameter as follows. Of course, the chosen values could be changed. In the case of a proportion, you can specify any value between 0 and 1.
#Initializing values for the parameter "Prev" for the 3 chains
inits <- list(
list(Prev=0.10),
list(Prev=0.50),
list(Prev=0.90)
)
We now have an R object called inits that
contains the three values where each Markov chain will start.
We now have the three elements that we needed: the data, the model,
and a set of initial values. We can now use the R2jags
library to run our Bayesian analysis. The R2jags package
provides an interface from R to the JAGS software for
Bayesian data analysis. JAGS uses Markov Chain Monte Carlo
(MCMC) simulation to generate a sample from the
posterior distribution of the parameters.
The R2jags function that we will use for that purpose is
the jags() function. Here are the arguments that we need to
specify:
- data=: the R object containing the data
- model.file=: the .txt file containing the model
- parameters.to.save=: the unknown parameters to monitor
- n.chains=: the number of Markov chains. The default is 3
- inits=: the sets of initial values. If inits is NULL, JAGS will randomly generate values
- n.iter=: the number of iterations per chain
- n.burnin=: the number of iterations to discard at the beginning of each chain
- n.thin=: the thinning rate; 1 in most situations
- DIC=: DIC=TRUE or DIC=FALSE will determine whether to compute the deviance, pD, and deviance information criterion (DIC). These can sometimes be useful for model selection/comparison. We can leave it at DIC=FALSE since we will not be using these values during the workshop

With the following code, we will create an R object
called bug.out. This object will contain the results of running
our model with the specified priors using the available data, and the
sets of initial values that we have created. For this analysis, we have
asked to run 3 Markov chains for 6,000 iterations each and we discarded
the first 1,000 iterations of each chain. Finally, we will monitor our
only unknown parameter, Prev, and nothing else.
library(R2jags)
library(coda)
#Run the Bayesian model
bug.out <- jags(data=datalist, #Specifying the R object containing the data
model.file="model_single_prop.txt", #Specifying the .txt file containing the model
parameters.to.save=c("Prev"), #Specifying the parameters to monitor. When >1 parameter, it will be a list, ex: c("Prev", "Se", "Sp")
n.chains=3, #Number of chains
inits=inits, #Specifying the R object containing the initial values
n.iter=6000, #Specifying the number of iterations
n.burnin=1000, #Specifying the number of iterations to remove at the beginning
n.thin=1, #Specifying the thinning rate
DIC=FALSE) #Indicating that we do not request the deviance, pD, nor DIC
## Compiling model graph
## Resolving undeclared variables
## Allocating nodes
## Graph information:
## Observed stochastic nodes: 1
## Unobserved stochastic nodes: 1
## Total graph size: 4
##
## Initializing model
As you can see, we received a few messages. It looks good, don’t you think?
Before jumping to the results, you should first check if:
- The chains have converged,
- The burn-in period was long enough,
- You have a large enough sample of values to describe the posterior
distribution with sufficient precision,
- The Markov chains behaved well (mainly their auto-correlation).
There are many diagnostic methods available for Bayesian analyses, but, for this workshop, we will mainly use basic methods relying on plots.
Diagnostics are available when you convert your model output into a
MCMC object using the as.mcmc() function from the
coda library; the mcmcplots library then provides the diagnostic plots.
library(mcmcplots)
bug.mcmc <- as.mcmc(bug.out) #Converting the model output into a MCMC object (for diagnostic plots)
You can then use the mcmcplot() function on this newly
created MCMC object.
mcmcplot(bug.mcmc, title="Diagnostic plots") #Asking for the diagnostic plots
When you do that, an HTML document will be created and your web browser will automatically open it. Here is a snapshot of the HTML file in my browser:
Figure. Diagnostic plots.
To check whether convergence of the Markov chains was achieved you can inspect the “trace plot” (the bottom figure) presenting the values that were picked by each chain on each iteration. The different chains that you used (three in the preceding example) will be overlaid on the trace plot. You should inspect the trace plot to evaluate whether the different chains all converged in the same area. If so, after a number of iterations they should be moving in a relatively restricted area and this area should be the same for all chains.
The plots above would be a good example of the trace plot of three “nice” Markov chains (lucky us!). In this preceding example, we can see on the trace plot that the chains had already converged (probably way before the end of the first 1000 iterations that were excluded) toward the same area. On the trace plot, we can only see the values that were stored after the burn-in period. The chains were then moving freely within this limited area (thus sampling from the posterior distribution).
We could conclude based on the trace plot that:
- The Markov chains did converge
- A short burn-in period of less than 1,000 iterations would have been
sufficient. Note that we will often use a longer burn-in (e.g., 5,000 iterations or
even more) just to be sure, and because it only costs a few more seconds
on your computer…
Here is another example where two chains (the red and green)
converged in the same area, while a third one (the blue) also converged,
but in a different area. We should be worried in such a
case. We will see later the reason for such behavior.
Finally, here is a last example with just one Markov chain which
clearly did not converge, even after 25,000 iterations. You
should be terrified by that!!! ;-).
Next, you can inspect the posterior distributions to determine if you have a large enough sample of independent values to describe, with enough precision, the median estimate and the 2.5th and 97.5th percentiles. The effective sample size (ESS) takes into account the number of values generated by the Markov chains AND whether these values were auto-correlated, to compute the number of “effective” independent values that are used to describe the posterior distribution. Chains that are auto-correlated will generate a smaller number of “effective values” than chains that are not auto-correlated.
The effectiveSize() function of the coda
library provides a way to appraise whether you ran enough iterations. In
the current example, we already created a MCMC object named
bug.mcmc. We can ask for the effective sample size as
follows:
effectiveSize(bug.mcmc) #Asking for the ESS
## Prev
## 8724.061
So we see here that, despite having stored 3 times 5,000 values (15,000 values), we have the equivalent of almost 9,000 independent values. What to do with this information? Well, you could, for instance, decide on an arbitrary rule for ESS (say 10,000 values) and then adapt the length of the chains to achieve such an ESS (for each of the selected parameters, because the effective sample sizes will not be the same for all parameters).
Remember, a feature of Markov chains is to have some auto-correlation between two immediately subsequent iterations and then a correlation that quickly goes down to zero (i.e., they have a short memory). The auto-correlation plot can be used to assess if this feature is respected.
From our previous example (check the upper right figure in the plot
below), it seems that it is the case.
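If you want to inspect auto-correlation yourself outside of mcmcplot(), base R's acf() function can be applied to the stored values of a chain. Here is a sketch using simulated independent draws as a stand-in for one chain's stored values; for such a well-behaved "chain", the auto-correlation should be near zero at all lags greater than 0:

```r
set.seed(42)
draws <- rnorm(5000)             # stand-in for the stored values of one chain
ac <- acf(draws, plot=FALSE)$acf # auto-correlations at lags 0, 1, 2, ...
# Lag 0 is always exactly 1; subsequent lags should be close to 0 here
```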
Here is another example where we do have auto-correlation that
persists.
When such a behavior is observed, running the chains for more iterations will help achieve the desired ESS. In some situations, specifying different priors (i.e., increasing the amount of information in the priors) may also help.
Now, if you are happy with the behavior of your Markov chains, the
number of iterations, and burn-in period, you can summarize your results
(i.e., report the median, 2.5th and 97.5th percentiles) very easily with
the print() function. Below, I have also indicated
digits.summary=3 to adjust the precision of the estimates
that are reported (the default is 1).
print(bug.out, #Asking to see the mean, median, etc of the unknown parameters that were monitored
digits.summary=3) #Requiring a precision of 3 digits
## Inference for Bugs model at "model_single_prop.txt", fit using jags,
## 3 chains, each with 6000 iterations (first 1000 discarded)
## n.sims = 15000 iterations saved
## mu.vect sd.vect 2.5% 25% 50% 75% 97.5% Rhat n.eff
## Prev 0.304 0.045 0.219 0.273 0.303 0.334 0.397 1.001 15000
##
## For each parameter, n.eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor (at convergence, Rhat=1).
Here you see that the median Prev estimate (95CI) was 30% (22, 40).
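Incidentally, this simple model is one of the rare cases where the exact posterior is known in closed form: a Beta(a, b) prior combined with a binomial likelihood yields a Beta(T+a, n-T+b) posterior (beta-binomial conjugacy). With our flat Beta(1, 1) prior, T=30, and n=100, the exact posterior is Beta(31, 71), and base R's qbeta() function reproduces the percentiles that JAGS estimated by simulation:

```r
# Exact posterior percentiles for a Beta(1, 1) prior with 30 positives out of 100
post <- qbeta(c(0.025, 0.5, 0.975), shape1=30+1, shape2=100-30+1)
round(post, 3) # compare with the 2.5%, 50%, and 97.5% columns of the JAGS output
```

More complex LCM have no such closed-form solution; that is exactly why we need MCMC simulation.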
You can also plot the bug.mcmc object that you
have created using the plot() function. It produces
the trace plot and a density plot presenting the posterior distribution
of each unknown parameter.
plot(bug.mcmc) #Asking for trace plot and plots of prior and posterior distributions
Trace plot and density plot produced using the plot() function.
If you prefer to work on your own plots, note that all the values sampled from your posterior distributions were stored in the bug.out object that you have created. If you inspect the bug.out object, you will notice an element named BUGSoutput; inside this element, there is another object named sims.list; within this sims.list, there is an element named Prev. As you can see below, this element is made up of a list of 15,000 values. This corresponds to the 5,000 values sampled (1/iteration) by each Markov chain (3 chains) for the unknown parameter Prev. So this element is simply all the sampled values assembled together.
sims.list object located in the bug.out output.
You could use this element, for instance, to plot the prior and posterior distributions together on the same figure. On the figure below, the prior distribution is not obvious; it is the flat dashed red line just above the x axis (remember, we used a flat prior for Prev).
#Plot the posterior distribution
plot(density(x=bug.out$BUGSoutput$sims.list$Prev), #Indicating variable to plot
main="Disease prevalence", #Title for the plot
xlab="Prevalence", ylab="Density", #x and y axis titles
)
#plot the prior distribution
curve(dbeta(x, shape1=Prev.shapea, shape2=Prev.shapeb), from=0, to=1, #Plotting the corresponding prior distribution
lty=2, #Asking for a dashed line pattern
col="red", #Asking for a red line
add=TRUE) #Asking to add this plot to the preceding one
Prior (dashed red line) and posterior (full black line) distribution of the prevalence of disease.
The data used for the following exercises came from a study aiming at estimating the diagnostic accuracy of a milk pregnancy-associated glycoprotein (PAG) ELISA and a transrectal ultrasonographic (US) exam when used for determining the pregnancy status of individual dairy cows 28 to 45 days post-insemination. In that study, the new test under investigation was the PAG test; US was used for comparison, but was considered an imperfect reference test.
In the original study, data from 519 cows from 18 commercial dairy herds were used. For the following exercises, the dataset was modified so that we have data from 519 cows from 3 herds (i.e., the data from 16 herds were collapsed together so they appear to be from one large herd). Note that there are some cows with missing values for one of the tests, so we will not always work with 519 cows. The complete original publication can be found here: Dufour et al., 2017.
The dataset Preg.xlsx contains the results from the study. However, for the exercises we will always simply use the cross-tabulated data, which will already be organized for you and presented at the beginning of each exercise. The dataset is still provided so you can further explore additional comparisons. The variables in the Preg.xlsx dataset are described in the table below.
Table. List of variables in the Preg.xlsx dataset.
| Variable | Description | Range |
|---|---|---|
| Obs | An observation unique ID number | 1 to 519 |
| Herd | Herd ID number | 1 to 3 |
| Cow | Cow ID number | NA |
| T1_DOP | Number of days since insemination at testing | 28 to 45 |
| PAG_DX | Results from the PAG test | 0 (not pregnant); 1 (pregnant) |
| US_DX | Results from the ultrasound test | 0 (not pregnant); 1 (pregnant) |
| TRE_DX | Results from the transrectal exam (TRE) test | 0 (not pregnant); 1 (pregnant) |
We will use data from one population (herd #1). In this herd, we had the following proportion of apparently pregnant cows (based on the US exam).
139 US+/262 exams
Open up the R project for Exercise 1 (i.e., the R
project file named Exercise 1.Rproj).
In the Exercise 1 folder, you will find partially completed
R scripts. To answer the questions, try to complete the
missing parts (they are highlighted by a #TO COMPLETE#
comment). We also provided complete R scripts, but try to
work it out on your own first!
1. Start from the partially completed Question 1.R script located in Exercise 1 folder. What would be the proportion of pregnant cows? Use a Bayesian approach to compute that proportion and a credible interval (CI). For this first computation, assume that you have no prior knowledge on the expected proportion of pregnant cows in a Canadian herd. Run three Markov chains with different sets of initial values. Look at the trace plot. Do you think convergence was achieved? Do you need a longer burn-in period? Are all 3 chains converging in the same space? Compute the effective sample size (ESS), do you feel that number of iterations was sufficient to describe the posterior distribution with enough precision? What about auto-correlation of the Markov chains, any issues there?
2. If you were to compute a Frequentist estimate with 95% CI, would it differ a lot from your Bayesian estimates? Why? As a refresher, the formula for a Frequentist 95%CI for a proportion is below (where P is the actual observed proportion and n is the denominator for that proportion):
\(95CI=P \pm 1.96* \sqrt{\frac{P*(1-P)}{n}}\)
3. Start from the R script that you completed in
Question 1. In the literature, you saw a recent study conducted on 30 Canadian herds reporting an expected pregnancy prevalence of 42% with a 97.5th percentile of 74%. What kind of distribution could you use to represent this information? Use it as a pregnancy prevalence prior distribution. Are your results very different?
4. Start from the partially completed Question 4.R script located in Exercise 1 folder. When comparing PAG to US results in herd #1, you got the following 2x2 table.
Table. Cross-classified results of the PAG and US tests in herd #1.
| US+ | US- | Total | |
|---|---|---|---|
| PAG+ | 138 | 10 | 148 |
| PAG- | 1 | 113 | 114 |
| Total | 139 | 123 | 262 |
Assuming that US is a gold-standard test could you compute PAG sensitivity (Se) and specificity (Sp)? Use vague priors for PAG Se and Sp since it is the first ever study on this topic (i.e., you do not have any prior knowledge on these unknown parameters).
1. What would be the proportion of pregnant cows? Use a Bayesian approach to compute that proportion and a credible interval (CI). For this first computation, assume that you have no prior knowledge on the expected proportion of pregnant cows in a Canadian herd. Run three Markov chains with different sets of initial values. Look at the trace plot. Do you think convergence was achieved? Do you need a longer burn-in period? Are all 3 chains converging in the same space? Compute the effective sample size (ESS), do you feel that number of iterations was sufficient to describe the posterior distribution with enough precision? What about auto-correlation of the Markov chains, any issues there?
Answer: I chose to run a model with 3 chains of 10,000 iterations each (11,000 iterations minus a burn-in of 1,000). I have obtained the following diagnostic plots:
Diagnostic plot
Convergence of the 3 chains was achieved; all three chains appear to be moving in the same space (see trace plot). The 1,000-iteration burn-in period is probably more than needed for this very simple problem. The autocorrelation plot looks perfect, with the correlation declining very rapidly, dropping close to zero at a lag of 1. The effective sample size for the Prev parameter is >18,000 values. So, plenty of precision to report the median and the 2.5th and 97.5th percentiles.
The median pregnancy prevalence estimate (95% CI) was 53.0% (46.9, 59.0).
2. If you were to compute a Frequentist estimate with 95% CI, would it differ a lot from your Bayesian estimates? Why?
Answer: It should not differ much from the Bayesian median estimate and 95CI because the latter were generated using vague priors. In such cases, Bayesian and Frequentist estimates should be quite similar. Actually, if we use the Frequentist formula for computing the 95CI and the observed data (i.e., 139/262) we get:
\(P = 139/262 = 0.531\)
\(95CI=0.531 \pm 1.96* \sqrt{\frac{0.531*(1-0.531)}{262}} = 0.531 \pm 0.060\)
Thus, we have a Frequentist estimated proportion of 53.1% with a Frequentist 95 CI of 47.1 to 59.1 (virtually unchanged compared to the Bayesian estimates).
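As a quick check, this hand calculation can be reproduced in R (a minimal sketch; the object names are ours):

```r
# Frequentist (Wald) estimate and 95% CI for the proportion of pregnant cows
x <- 139                       # number of US+ cows
n <- 262                       # number of cows examined
p <- x / n                     # observed proportion
se <- sqrt(p * (1 - p) / n)    # standard error of the proportion
ci <- p + c(-1.96, 1.96) * se  # Wald 95% confidence limits
round(c(estimate = p, lower = ci[1], upper = ci[2]), 3)
```

Small differences from the hand calculation above are due to rounding of the ±0.060 margin.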
3. In the literature, you saw a recent study conducted on 30 Canadian herds reporting an expected pregnancy prevalence of 42% with a 97.5th percentile of 74%. What kind of distribution could you use to represent this information? Use it as a pregnancy prevalence prior distribution. Are your results very different?
Answer: A beta(4.2, 5.4) distribution would have a mode of 0.42 and a 97.5th percentile of 0.74. Using this distribution as a prior, I get a prevalence of pregnancy of 52.7% (95 CI: 46.8, 58.6). Actually, we observe very little difference between the models using vague vs. informative priors. This is because this informative prior contains only the equivalent of about 10 observations (4.2 + 5.4 = 9.6). In the dataset, there are 262 cows. Therefore, the estimation process is still mainly driven by our dataset.
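We can verify in R that a beta(4.2, 5.4) distribution matches the published figures (a quick check; the 0.42 and 0.74 targets come from the study cited above):

```r
# Check that a beta(4.2, 5.4) distribution has the desired mode and
# 97.5th percentile
a <- 4.2
b <- 5.4
mode <- (a - 1) / (a + b - 2)  # mode of a beta(a, b) distribution
p975 <- qbeta(0.975, a, b)     # 97.5th percentile
round(c(mode = mode, p975 = p975), 2)
```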
4. Assuming that US is a gold-standard test could you compute PAG sensitivity (Se) and specificity (Sp)? Use vague priors for PAG Se and Sp since it is the first ever study on this topic (i.e., you do not have any prior knowledge on these unknown parameters).
Answer: The sensitivity and specificity are simple proportions. Sensitivity is simply the proportion testing positive among the truly diseased individuals. We have data for these in the 2x2 table. If we assume that US is a gold-standard test, then the number of truly diseased (i.e., pregnant) cows is 139. Of them, 138 tested positive to the PAG test. Similarly, the specificity is simply the proportion testing negative among the truly healthy individuals. From the 2x2 table, the number of truly negative cows was 123 and 113 of them tested negative to the PAG test. The likelihood functions (one for each parameter in this case) linking the unknown parameters (Se and Sp) to these observed data would be:
\(Test+ \sim dbin(Se, True+)\)
\(Test- \sim dbin(Sp, True-)\)
Using vague beta(1, 1) priors on the Se and Sp, I got an estimated Se of 98.8% (95 CI: 96.1, 99.8) and a Sp of 91.5% (95CI: 85.9, 95.5). But be cautious with these numbers. We know very well that an US exam is not a perfect diagnostic test for pregnancy in dairy cows. With the current approach, we are attributing all disagreements between tests to a failure of the PAG test… But it could be the US test that was wrong in many instances. Moreover, animals for which the two tests agreed could still be misdiagnosed as pregnant or open, but by both tests (i.e., a failure of the PAG AND the US tests). We will see in the next sections of the course how to account for the fact that neither of the tests is a gold standard.
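Since beta priors are conjugate for binomial likelihoods, this simple model actually has closed-form posteriors, which can be used to cross-check the MCMC output (a sketch using the counts from the 2x2 table above):

```r
# Posterior for Se: beta(1 + 138, 1 + 1)  -- 138 PAG+ among 139 US+ cows
# Posterior for Sp: beta(1 + 113, 1 + 10) -- 113 PAG- among 123 US- cows
se_post <- qbeta(c(0.5, 0.025, 0.975), 1 + 138, 1 + (139 - 138))
sp_post <- qbeta(c(0.5, 0.025, 0.975), 1 + 113, 1 + (123 - 113))
round(se_post, 3)  # median and 95% credible limits for Se
round(sp_post, 3)  # median and 95% credible limits for Sp
```

These closed-form values match the MCMC estimates reported above (within rounding and Monte Carlo error).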
When using two imperfect diagnostic tests on the same population, we will have to use a Bayesian latent class model (LCM) to estimate the true disease prevalence, and the sensitivity and specificity of both diagnostic tests. Two different approaches could be used. The simplest approach would be to consider that the two diagnostic tests are independent conditionally on the true health status of the individuals. Conditional independence can be defined as:
Two tests are conditionally independent when the sensitivity (or specificity) of one test does not depend on whether the results of another test are positive or negative among infected (or non-infected) individuals. Cheung et al., 2021
In many situations, however, we will use two diagnostic tests that rely on the same biological process. For instance, an ELISA test could measure antibodies against a given pathogen in blood and another test could also measure antibodies, but in milk. In such a case, it would be difficult to pretend that:
The sensitivity of the milk test does not depend on whether the results of the blood test are positive or negative among infected individuals.
In such cases, it would be better to consider a LCM where conditional dependence between tests is possible. Note that, even for tests that rely on very different biological processes, it is rather difficult (or, actually, often impossible) to “demonstrate” conditional independence between them. So, if doable, allowing for a potential conditional dependence between tests is the more robust strategy.
When two tests are conditionally independent, then obtaining a given combination of results to Test A and Test B, as function of the true disease prevalence (P) and of the accuracy of each test (SeA, SpA, SeB, SpB) can be described as illustrated below:
Illustration of the different potential results when applying two conditionally independent tests in a population.
From that illustration, we can deduce that the probability of testing positive to both tests is the sum of the first and fifth rows:
\(P(TestA+ TestB+) = P*SeA*SeB + (1-P)*(1-SpA)*(1-SpB)\)
Or, put into words, it is the probability of being a truly diseased individual and of being identified as such by both tests (\(P*SeA*SeB\)), plus the probability of being a healthy individual and being misdiagnosed as diseased by both tests (\((1-P)*(1-SpA)*(1-SpB)\)). If we extend this reasoning to all possible combinations of test results, we get:
\(P(TestA+TestB+) = P*SeA*SeB + (1-P)*(1-SpA)*(1-SpB)\)

\(P(TestA+TestB-) = P*SeA*(1-SeB) + (1-P)*(1-SpA)*SpB\)

\(P(TestA-TestB+) = P*(1-SeA)*SeB + (1-P)*SpA*(1-SpB)\)

\(P(TestA-TestB-) = P*(1-SeA)*(1-SeB) + (1-P)*SpA*SpB\)
When applying two diagnostic tests on a given population, the data generated can be presented very simply as the number of individuals in each cell of the 2x2 table presenting the cross-classified results of the two tests:
| Test A+ | Test A- | |
|---|---|---|
| Test B+ | n1 | n3 |
| Test B- | n2 | n4 |
Thus, the likelihood function that could link the observed data (n1, n2, n3, n4) to the unknown parameters (P, SeA, SeB, SpA, SpB) could be defined as follows, using a multinomial distribution:
\(n[1:4] \sim dmulti(P[1:4], n)\)
\(P1 = P*SeA*SeB + (1-P)*(1-SpA)*(1-SpB)\)

\(P2 = P*SeA*(1-SeB) + (1-P)*(1-SpA)*SpB\)

\(P3 = P*(1-SeA)*SeB + (1-P)*SpA*(1-SpB)\)

\(P4 = P*(1-SeA)*(1-SeB) + (1-P)*SpA*SpB\)
Again, if we put that in words, the number of individuals in cell n1 (i.e., TestA+ and TestB+) is determined by the number of individuals tested (n) and the probability (P1) of falling in that cell. The latter itself depends on the true prevalence of disease in the population (P), and on each of the two tests sensitivity (SeA and SeB) and specificity (SpA and SpB). We can use similar reasoning for n2, n3, and n4. Finally, the multinomial distribution would further impose that the four probabilities (P1, P2, P3, and P4) have to sum up to 1.0 (i.e., 100%).
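As a quick sanity check, we can verify in R that these four probabilities sum to 1 for any values of the five parameters (the values below are arbitrary):

```r
# Arbitrary (hypothetical) values for the five unknown parameters
P <- 0.5; SeA <- 0.90; SpA <- 0.95; SeB <- 0.97; SpB <- 0.92
p1 <- P*SeA*SeB         + (1-P)*(1-SpA)*(1-SpB)
p2 <- P*SeA*(1-SeB)     + (1-P)*(1-SpA)*SpB
p3 <- P*(1-SeA)*SeB     + (1-P)*SpA*(1-SpB)
p4 <- P*(1-SeA)*(1-SeB) + (1-P)*SpA*SpB
sum(p1, p2, p3, p4)  # sums to 1 (up to floating-point rounding)
```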
In such a model, we will have five unknown parameters to estimate (P, SeA, SeB, SpA, SpB), but only three degrees of freedom (the four probabilities must sum to 1, so the 2x2 table provides only three independent pieces of information). Thus, the model will be “non-identifiable”, and we will need to provide informative priors on at least two parameters. Often, we will provide priors on the Se and Sp of one of the tests, and/or on disease prevalence.
To run such a model, we simply need to provide a dataset where n1, n2, n3, and n4 are listed (in that order).
#n is of the form : (TestA pos and TestB pos), (TestA pos and TestB neg), (TestA neg and TestB pos), then (TestA neg and TestB neg)
datalist <- list(n=c(138,1,10,113))
Moreover, we will need the following information to generate the
a and b shape parameters of the various distributions.
These values will be included in the text file presenting the JAGS
model. Note that we have made use below of the paste0()
function to produce a model in which you could modify the tests
“labels”. You could thus use the JAGS model that will be described next
as a generic model for various diagnostic tests, and still obtain
“meaningful” outputs.
#I could first create labels for TestA and TestB
TestA <- "US"
TestB <- "PAG"
#Provide information for the prior distributions (all beta distributions) for the 5 unknown parameters
#Below, I have used informative priors for Prev, and for the Se and Sp of Test A. Where these values come from is not important at this stage.
Prev.shapea <- 4.2 #a shape parameter for Prev
Prev.shapeb <- 5.4 #b shape parameter for Prev
Se.TestA.shapea <- 131 #a shape parameter for Se of TestA
Se.TestA.shapeb <- 15 #b shape parameter for Se of TestA
Sp.TestA.shapea <- 100 #a shape parameter for Sp of TestA
Sp.TestA.shapeb <- 6 #b shape parameter for Sp of TestA
Se.TestB.shapea <- 1 #a shape parameter for Se of TestB
Se.TestB.shapeb <- 1 #b shape parameter for Se of TestB
Sp.TestB.shapea <- 1 #a shape parameter for Sp of TestB
Sp.TestB.shapeb <- 1 #b shape parameter for Sp of TestB
#I will also need the total number of individuals tested (n)
n <- sapply(datalist, sum)
With that, we have everything that is needed to write the JAGS model.
#Create the JAGS text file
model_2tests_1pop_indep <- paste0("model{
#=== LIKELIHOOD ===#
n[1:4] ~ dmulti(p[1:4], ",n,")
p[1] <- Prev*Se_", TestA, "*Se_", TestB, " + (1-Prev)*(1-Sp_", TestA, ")*(1-Sp_", TestB, ")
p[2] <- Prev*Se_", TestA, "*(1-Se_", TestB, ") + (1-Prev)*(1-Sp_", TestA, ")*Sp_", TestB, "
p[3] <- Prev*(1-Se_", TestA, ")*Se_", TestB, " + (1-Prev)*Sp_", TestA, "*(1-Sp_", TestB, ")
p[4] <- Prev*(1-Se_", TestA, ")*(1-Se_", TestB, ") + (1-Prev)*Sp_", TestA, "*Sp_", TestB, "
#=== PRIOR ===#
Prev ~ dbeta(",Prev.shapea,", ",Prev.shapeb,") ## Prior for Prev
Se_", TestA, " ~ dbeta(",Se.TestA.shapea,", ",Se.TestA.shapeb,") ## Prior for Se of Test A
Sp_", TestA, " ~ dbeta(",Sp.TestA.shapea,", ",Sp.TestA.shapeb,") ## Prior for Sp of Test A
Se_", TestB, " ~ dbeta(",Se.TestB.shapea,", ",Se.TestB.shapeb,") ## Prior for Se of Test B
Sp_", TestB, " ~ dbeta(",Sp.TestB.shapea,", ",Sp.TestB.shapeb,") ## Prior for Sp of Test B
}")
#write as a text (.txt) file
write.table(model_2tests_1pop_indep,
file="model_2tests_1pop_indep.txt",
quote=FALSE,
sep="",
row.names=FALSE,
col.names=FALSE)
With this code, you could simply modify the labels for Test A and Test B, and the shape parameters for the prior distributions and the text file with the JAGS model will automatically be updated. Currently, it looks like this:
Text file with the model for 2 independent diagnostic tests in 1 single population.
We will also need to provide a list of initial values (one per Markov chain) for all unknown parameters.
#Initializing values for the parameters Prev, and the Ses and Sps of the two tests for the 3 chains.
inits <- list(list(Prev=0.50,
Se_US=0.90,
Sp_US=0.90,
Se_PAG=0.90,
Sp_PAG=0.90),
list(Prev=0.70,
Se_US=0.10,
Sp_US=0.10,
Se_PAG=0.10,
Sp_PAG=0.10),
list(Prev=0.50,
Se_US=0.50,
Sp_US=0.50,
Se_PAG=0.50,
Sp_PAG=0.50)
)
Now we can run the model using jags() function.
library(R2jags)
library(coda)
#Run the Bayesian model
bug.out <- jags(data=datalist,
model.file="model_2tests_1pop_indep.txt",
parameters.to.save=c("Prev", "Se_US", "Sp_US", "Se_PAG", "Sp_PAG"),
n.chains=3,
inits=inits,
n.iter=11000,
n.burnin=1000,
n.thin=1,
DIC=FALSE)
## Compiling model graph
## Resolving undeclared variables
## Allocating nodes
## Graph information:
## Observed stochastic nodes: 1
## Unobserved stochastic nodes: 5
## Total graph size: 32
##
## Initializing model
Then we could produce the diagnostic plots, compute the ESS, and print out our results as we did previously (results not shown).
library(mcmcplots)
bug.mcmc <- as.mcmc(bug.out)
mcmcplot(bug.mcmc, title="Diagnostic plots")
effectiveSize(bug.mcmc)
print(bug.out, digits.summary=3)
Or we could ask to plot the prior and posterior distributions of one (or many) of the parameters. Here, we asked for the prior and posterior distributions of US sensitivity.
#Plot the posterior distribution
plot(density(x=bug.out$BUGSoutput$sims.list$Se_US),
main="US sensitivity",
xlab="Sensitivity", ylab="Density",
)
#plot the prior distribution
curve(dbeta(x, shape1=Se.TestA.shapea, shape2=Se.TestA.shapeb), from=0, to=1,
lty=2,
col="red",
add=TRUE)
Prior (dashed red line) and posterior (full black line) distribution of US sensitivity.
We will not go into the exact details, but Dendukuri and Joseph (2001) developed a highly generalisable model structure that can account for conditional dependence between diagnostic tests. To allow for conditional dependence between tests, the likelihood function would be modified as follows, by adding covariance terms representing the conditional dependence between tests:
\(n[1:4] \sim dmulti(P[1:4], n)\)
\(P1 = P*(SeA*SeB+covp) + (1-P)*((1-SpA)*(1-SpB)+covn)\)

\(P2 = P*(SeA*(1-SeB)-covp) + (1-P)*((1-SpA)*SpB-covn)\)

\(P3 = P*((1-SeA)*SeB-covp) + (1-P)*(SpA*(1-SpB)-covn)\)

\(P4 = P*((1-SeA)*(1-SeB)+covp) + (1-P)*(SpA*SpB+covn)\)
Thus, a small quantity (covp) is added to the product of the tests’ sensitivities when they agree on a positive individual, and that same quantity is subtracted when the tests disagree. covp thus represents the covariance between tests among the truly positive individuals. Similarly, covn represents the covariance between tests among the truly negative individuals. The effect of these covariances, if greater than zero, will be to penalize our estimates of the tests’ sensitivities and specificities to account for their conditional dependence.
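Note that the covariance terms cancel out across the four cells, so the probabilities still sum to 1. We can verify this in R with arbitrary illustrative values:

```r
# Arbitrary (hypothetical) parameter values
P <- 0.5; SeA <- 0.90; SpA <- 0.95; SeB <- 0.97; SpB <- 0.92
covp <- 0.02; covn <- 0.01
p1 <- P*(SeA*SeB + covp)         + (1-P)*((1-SpA)*(1-SpB) + covn)
p2 <- P*(SeA*(1-SeB) - covp)     + (1-P)*((1-SpA)*SpB - covn)
p3 <- P*((1-SeA)*SeB - covp)     + (1-P)*(SpA*(1-SpB) - covn)
p4 <- P*((1-SeA)*(1-SeB) + covp) + (1-P)*(SpA*SpB + covn)
sum(p1, p2, p3, p4)  # sums to 1 (the covariance terms cancel)
```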
Moreover, Dendukuri and Joseph (2001) proposed constraints on these covariance terms based on the inferred test sensitivities and specificities. This allows us to specify “natural limits” for the prior distributions of the covariance terms. Briefly, they demonstrated that:

\((1-SeA)*(SeB-1) \le covp \le min(SeA, SeB) - SeA*SeB\)

\((SpA-1)*(1-SpB) \le covn \le min(SpA, SpB) - SpA*SpB\)

Thus, we can use these as the minimum and maximum for covp and covn. For instance, we could use uniform distributions with these limits to describe the prior distributions for covp and covn. Our LCM would thus become:
#Create the JAGS text file
model_2tests_1pop_dep <- paste0("model{
#=== LIKELIHOOD ===#
n[1:4] ~ dmulti(p[1:4], ",n,")
p[1] <- Prev*(Se_", TestA, "*Se_", TestB, " + covp) + (1-Prev)*((1-Sp_", TestA, ")*(1-Sp_", TestB, ") + covn)
p[2] <- Prev*(Se_", TestA, "*(1-Se_", TestB, ") - covp) + (1-Prev)*((1-Sp_", TestA, ")*Sp_", TestB, " - covn)
p[3] <- Prev*((1-Se_", TestA, ")*Se_", TestB, " - covp) + (1-Prev)*(Sp_", TestA, "*(1-Sp_", TestB, ") - covn)
p[4] <- Prev*((1-Se_", TestA, ")*(1-Se_", TestB, ") + covp) + (1-Prev)*(Sp_", TestA, "*Sp_", TestB, " + covn)
#=== PRIOR ===#
Prev ~ dbeta(",Prev.shapea,", ",Prev.shapeb,") ## Prior for Prev
Se_", TestA, " ~ dbeta(",Se.TestA.shapea,", ",Se.TestA.shapeb,") ## Prior for Se of Test A
Sp_", TestA, " ~ dbeta(",Sp.TestA.shapea,", ",Sp.TestA.shapeb,") ## Prior for Sp of Test A
Se_", TestB, " ~ dbeta(",Se.TestB.shapea,", ",Se.TestB.shapeb,") ## Prior for Se of Test B
Sp_", TestB, " ~ dbeta(",Sp.TestB.shapea,", ",Sp.TestB.shapeb,") ## Prior for Sp of Test B
#=== CONDITIONAL DEPENDENCE STRUCTURE ===#
covp ~ dunif(minp,maxp)
covn ~ dunif(minn,maxn)
minp <- (1-Se_", TestA, ")*(Se_", TestB, "-1)
minn <- (Sp_", TestA, "-1)*(1-Sp_", TestB, ")
maxp <- min(Se_", TestA, ",Se_", TestB, ") - Se_", TestA, "*Se_", TestB, "
maxn <- min(Sp_", TestA, ",Sp_", TestB, ") - Sp_", TestA, "*Sp_", TestB, "
}")
#write as a text (.txt) file
write.table(model_2tests_1pop_dep,
file="model_2tests_1pop_dep.txt",
quote=FALSE,
sep="",
row.names=FALSE,
col.names=FALSE)
Again, with this code, you could simply modify the labels for Test A and Test B, and the shape parameters for the prior distributions and the text file with the JAGS model will automatically be updated. Currently, it looks like this:
Text file with the model for 2 conditionally dependent diagnostic tests in 1 single population.
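For intuition, the “natural limits” that this model computes for covp can be evaluated directly for hypothesised sensitivities (a sketch; the Se values below are arbitrary):

```r
# Bounds for the positive covariance term (Dendukuri and Joseph, 2001)
Se_A <- 0.90   # hypothetical sensitivity of test A
Se_B <- 0.95   # hypothetical sensitivity of test B
minp <- (1 - Se_A) * (Se_B - 1)        # lower bound: -(1-SeA)*(1-SeB)
maxp <- min(Se_A, Se_B) - Se_A * Se_B  # upper bound
c(minp = minp, maxp = maxp)  # -0.005 and 0.045
```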
If we do not modify the other priors nor the dataset used previously
we could run this model using the jags() function. However,
we now have to add initial values for covp and covn.
Also, note that we have added covp and covn to the
list of parameters that we would like to monitor and we have increased
substantially the number of iterations. The model allowing for
conditional dependence between tests tends to be more difficult to
solve. We will thus need a substantial number of iterations to get a
decent ESS.
#Initializing values for the parameters Prev, and the Ses and Sps of the two tests for the 3 chains.
inits <- list(list(Prev=0.50,
Se_US=0.90,
Sp_US=0.90,
Se_PAG=0.90,
Sp_PAG=0.90,
covp=0,
covn=0),
list(Prev=0.70,
Se_US=0.10,
Sp_US=0.10,
Se_PAG=0.10,
Sp_PAG=0.10,
covp=0.01,
covn=0.01),
list(Prev=0.50,
Se_US=0.50,
Sp_US=0.50,
Se_PAG=0.50,
Sp_PAG=0.50,
covp=0.05,
covn=0.05)
)
library(R2jags)
library(coda)
#Run the Bayesian model
bug.out <- jags(data=datalist,
model.file="model_2tests_1pop_dep.txt",
parameters.to.save=c("Prev", "Se_US", "Sp_US", "Se_PAG", "Sp_PAG", "covp", "covn"),
n.chains=3,
inits=inits,
n.iter=101000,
n.burnin=1000,
n.thin=1,
DIC=FALSE)
## Compiling model graph
## Resolving undeclared variables
## Allocating nodes
## Graph information:
## Observed stochastic nodes: 1
## Unobserved stochastic nodes: 7
## Total graph size: 58
##
## Initializing model
Then we could produce the diagnostic plots, compute the ESS, and print out our results as we did previously (the diagnostic plots are not shown).
library(mcmcplots)
bug.mcmc <- as.mcmc(bug.out)
mcmcplot(bug.mcmc, title="Diagnostic plots")
effectiveSize(bug.mcmc)
print(bug.out, digits.summary=3)
effectiveSize(bug.mcmc)
## Prev Se_PAG Se_US Sp_PAG Sp_US covn covp
## 17536.534 1951.888 5991.089 7274.054 7367.546 7165.264 1952.859
print(bug.out, digits.summary=3)
## Inference for Bugs model at "model_2tests_1pop_dep.txt", fit using jags,
## 3 chains, each with 101000 iterations (first 1000 discarded)
## n.sims = 300000 iterations saved
## mu.vect sd.vect 2.5% 25% 50% 75% 97.5% Rhat n.eff
## Prev 0.554 0.038 0.480 0.529 0.554 0.580 0.630 1.001 14000
## Se_PAG 0.951 0.035 0.875 0.927 0.956 0.980 0.998 1.001 13000
## Se_US 0.906 0.022 0.857 0.893 0.908 0.922 0.944 1.001 12000
## Sp_PAG 0.924 0.040 0.836 0.898 0.928 0.954 0.990 1.001 7700
## Sp_US 0.950 0.021 0.901 0.938 0.953 0.965 0.982 1.002 3200
## covn 0.033 0.020 0.001 0.018 0.031 0.045 0.077 1.002 2600
## covp 0.036 0.028 0.000 0.011 0.031 0.056 0.097 1.001 14000
##
## For each parameter, n.eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor (at convergence, Rhat=1).
The ESS is still just around 2,000 for some parameters (e.g., Se_PAG); if this were for publication, we would possibly let it run for more iterations.
Open up the R project for Exercise 2 (i.e., the R
project file named Exercise 2.Rproj).
In the Exercise 2 folder, you will find partially completed
R scripts. To answer the questions, try to complete the
missing parts (they are highlighted by a #TO COMPLETE#
comment). We also provided complete R scripts, but try to
work it out on your own first!
1. Start from the partially completed Question 1.R script located in Exercise 2 folder. We could try to estimate the Se and Sp of the PAG-based ELISA test when compared to US as the reference test. We could, however, explicitly acknowledge that the US exam is not a gold-standard test. At first, we could assume that these tests are conditionally independent. We will use data from one population (herd #1). In herd #1, the cross-tabulated PAG by US results were as follows:
Table. Cross-classified results of the PAG and US tests in herd #1.
| US+ | US- | Total | |
|---|---|---|---|
| PAG+ | 138 | 10 | 148 |
| PAG- | 1 | 113 | 114 |
| Total | 139 | 123 | 262 |
How many unknown parameters do you have to estimate and how many degrees of freedom do you have? Therefore, how many informative priors will you need? Estimate PAG’s Se and Sp, and true pregnancy prevalence using informative priors on US’ Se and Sp, and on pregnancy’s prevalence. Compare your results to those of Exercise 1 (question 4) where you estimated PAG’s Se and Sp, but assuming that US was a gold standard test.
Some literature to help you with informative priors:
In 2006, Romano
et al. reported, for a US exam conducted between 28-45 days
post-insemination:
-Se = 0.90 (95CI: 0.85, 0.95)
-Sp = 0.95 (95CI: 0.90, 0.97)
Moreover, a recent study conducted on 30 Canadian herds reported an expected pregnancy prevalence of 42% with a 97.5th percentile of 74%.
2. Start from the R script that you completed in
Question 1. Now run the same model, but use vague priors on all
unknown parameters. Do you think it will work? What do you observe
regarding the Markov chains?
3. Start from the partially completed Question 3.R script located in Exercise 2 folder. Now, we could consider that some conditional dependency exists between the two tests. For instance, it was reported that PAG concentration is higher in first parity cows (Ricci et al., 2015, Mercadante et al., 2016). US exams may also be easier to conduct on first parity cows and, thus, the Se of both tests may be higher in first parity cows and lower in older cows. Build a model where conditional dependency is a possibility. How many unknown parameters and degrees of freedom do you have? Are your Markov chains behaving OK? What do you observe? Try to play with the number of iterations to reach an effective sample size (ESS) > 3,000. How are your results differing from those of question 1 (where you assumed conditional independence)?
1. We could try to estimate the Se and Sp of the PAG-based ELISA test using US as the reference test. We could, however, explicitly acknowledge that the US exam is not a gold-standard test. At first, we could assume that these tests are conditionally independent. How many unknown parameters do you have to estimate and how many degrees of freedom do you have? Therefore, how many informative priors will you need? Estimate PAG’s Se and Sp, and true pregnancy prevalence using informative priors on US’ Se and Sp, and on pregnancy’s prevalence. Compare your results to those of Exercise 1 (question 4) where you estimated PAG’s Se and Sp, but assuming that US was a gold standard test.
Answer: We now have 5 unknown parameters (PAG’s Se and Sp, US’ Se and Sp, and pregnancy prevalence, P) and we have 3 degrees of freedom (i.e., one 2x2 table where three cells provide valuable information; the content of the 4th cell can be deduced from the total number of observations minus the content of the other three cells). Thus, we need informative priors for at least 2 parameters. As suggested, we will use informative priors for 3 parameters (US’ Se and Sp, and P), so the model is identifiable.
Table. Results assuming US is not perfect
| Parameters | Median | 95CI |
|---|---|---|
| PAG Se | 0.993 | 0.968, 1.000 |
| PAG Sp | 0.979 | 0.922, 0.999 |
| US Se | 0.923 | 0.885, 0.955 |
| US Sp | 0.970 | 0.941, 0.987 |
| Prevalence | 0.554 | 0.493, 0.614 |
As a comparison, in question 4 of Exercise 1, where we assumed that US was a perfect test (i.e., that US Se and Sp were both 1.0), we estimated the accuracy of PAG as:
Table. Results assuming US is perfect
| Parameters | Median | 95CI |
|---|---|---|
| PAG Se | 0.988 | 0.961, 0.998 |
| PAG Sp | 0.915 | 0.859, 0.955 |
Thus, the 10 cows that were PAG+ and US- were all considered to be a failure of the PAG test (i.e., false positives on PAG). This resulted in a lower estimated PAG specificity. Now that we have indicated that US is not perfect, these disagreements are not exclusively attributed to the PAG test, hence the higher estimated PAG specificity.
2. Now run the same model, but use vague priors on all unknown parameters. Do you think it will work? What do you observe regarding the Markov chains?
Answer: Beforehand, we would say no, it will not work! We have 5 unknown parameters and only 3 degrees of freedom; the model is non-identifiable. With some datasets, though, it will sometimes work (if the number of potential solutions is very limited). This is the case here: there is so little disagreement between tests that the Markov chains were able to converge. However, they have a strange behaviour: for all parameters, we always have two chains converging in one area and another one converging at the exact opposite (see figure below). This is a case where we have a mirror solution. When such a situation arises, we could consider adding informative priors for at least one other parameter (to make the model identifiable). However, remember that mirror solutions may arise even in LCMs that are identifiable. In that case, we would have to exclude the Markov chain producing the “biologically impossible” results, or modify the set of initial values of the problematic Markov chain.
Diagnostic plot - Mirror solution
3. Build a model where conditional dependency is a possibility. How many unknown parameters and degrees of freedom do you have? Are your Markov chains behaving OK? What do you observe? Try to play with the number of iterations to reach an effective sample size (ESS) > 3,000. How are your results differing from that of question 1 (where you assumed conditional independence)?
Answer: We have 7 unknown parameters (PAG’s Se and Sp, US’ Se and Sp, P, covp, and covn). We still only have 3 degrees of freedom, so we need informative priors on at least 4 parameters. We can use the Romano et al. (2006) derived priors for US’ Se and Sp, and use our pregnancy prevalence informative prior. We can use the natural bounds of covp and covn proposed by Dendukuri and Joseph (2001) as priors for these parameters.
We initially ran the LCM for 11,000 iterations (with a burn-in of 1,000). The Markov chains were still cycling up and down for some of the parameters (Se_PAG and covp were the worst; see Se_PAG below), though within a very tight space (i.e., check the Y-axis scale). Autocorrelation seems to be a problem (especially for Se_PAG and covp, again).
Diagnostic plot - 11,000 iterations
The ESS for these two parameters were both around 170. So, despite the 30,000 values (3 chains times 11,000 iterations, minus the 1,000 burn-in of each chain) stored during the Monte Carlo simulation, we have just the equivalent of 170 independent values. This would be quite a small sample to report the median, 2.5th, and 97.5th percentiles with precision. Using the exact same model, but with 201,000 iterations, we obtained an ESS >3,500 for all parameters.
Table. Results of the LCM assuming or not conditional independence between diagnostic tests.
| Conditional independence LCM | Conditional dependence LCM | ||||
|---|---|---|---|---|---|
| Parameters | Median | 95CI | Median | 95CI | |
| PAG Se | 0.993 | 0.968, 1.00 | 0.952 | 0.866, 0.998 | |
| PAG Sp | 0.979 | 0.922, 0.999 | 0.924 | 0.833, 0.988 | |
| US Se | 0.923 | 0.885, 0.955 | 0.906 | 0.847, 0.947 | |
| US Sp | 0.970 | 0.941, 0.987 | 0.951 | 0.899, 0.981 | |
| covp | NA | NA | 0.035 | 0.000, 0.104 | |
| covn | NA | NA | 0.032 | 0.002, 0.079 | |
| Prevalence | 0.554 | 0.493, 0.614 | 0.555 | 0.479, 0.634 |
When we compare the LCMs assuming or not conditional independence, we can see that the tests’ Se and Sp estimates are slightly lower in the model allowing for conditional dependence. Indeed, in that model the covariance terms, covp and covn, were small but positive. Remember, in the model allowing for conditional dependence, the agreement between tests is now due to the PAG and US accuracy parameters (e.g., their Se) plus a small value due to their covariance (e.g., covp). Therefore, the tests’ accuracy estimates are reduced as their covariance increases.
In some situations, the diagnostic tests will be applied to two or more populations with different disease prevalences. In such cases, it could be reasonable to assume that the tests’ sensitivities and specificities are constant across populations, while the true disease prevalence differs between them.
It could then be advantageous to model the different populations simultaneously in the same LCM as described by Hui and Walter (1980). With two populations, we would have six unknown parameters (SeA and SpA, SeB and SpB, and each population’s prevalence, P1 and P2). When conducting two diagnostic tests in two populations, the data generated can be presented as two 2x2 tables presenting the cross-classified results of the two tests.
Table. Cross-classified results from two diagnostic tests in two populations.
| | Population 1, Test A+ | Population 1, Test A- | Population 2, Test A+ | Population 2, Test A- |
|---|---|---|---|---|
| Test B+ | Pop1 n1 | Pop1 n3 | Pop2 n1 | Pop2 n3 |
| Test B- | Pop1 n2 | Pop1 n4 | Pop2 n2 | Pop2 n4 |
We can see from these 2x2 tables that we have six degrees of freedom available. Indeed, in each of the 2x2 tables, three cells contribute valuable information, and the content of the 4th cell can be deduced from the total number of observations minus the other three cells. With six unknown parameters for six degrees of freedom, the model proposed by Hui and Walter (1980) is, therefore, just identifiable.
Cheung et al. (2021) have described how the number of unknown parameters and degrees of freedom vary as a function of the number of populations studied when comparing two conditionally independent diagnostic tests (see below).
Table. Number of degrees of freedom and of unknown parameters when comparing two conditionally independent diagnostic tests across various number of populations (adapted from Cheung et al. (2021)).
| Number of Populations | Number of unknown parameters (np) | Number of degrees of freedom (df) | Minimum number of informative priors needed |
|---|---|---|---|
| 1 | 5 | 3 | 2 |
| 2 | 6 | 6 | 0 |
| 3 | 7 | 9 | 0 |
| 4 | 8 | 12 | 0 |
| p | 4+p | 3*p | if df < np, then np-df |
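The arithmetic in the table above is easy to reproduce with a small helper function (hw_identifiability is a hypothetical name, not from any package):

```r
#Unknown parameters, degrees of freedom, and minimum number of informative
#priors for two conditionally independent tests applied to p populations
hw_identifiability <- function(p) {
  np <- 4 + p   #2 Se + 2 Sp, plus one prevalence per population
  df <- 3 * p   #each 2x2 table contributes 3 degrees of freedom
  data.frame(populations=p, np=np, df=df, priors_needed=pmax(np - df, 0))
}
hw_identifiability(1:4)
```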
With two populations, the likelihood function that could be used to link the observed data (Pop1 n1, Pop1 n2, …, Pop2 n4) to the unknown parameters (SeA and SpA, SeB and SpB, and P1 and P2) can be described as follows:
\(Pop1[1:4] \sim dmulti(Ppop1[1:4], nPop1)\)
\(Ppop1_1 = P1*SeA*SeB + (1-P1)*(1-SpA)*(1-SpB)\)
\(Ppop1_2 = P1*SeA*(1-SeB) + (1-P1)*(1-SpA)*SpB\)
\(Ppop1_3 = P1*(1-SeA)*SeB + (1-P1)*SpA*(1-SpB)\)
\(Ppop1_4 = P1*(1-SeA)*(1-SeB) + (1-P1)*SpA*SpB\)
\(Pop2[1:4] \sim dmulti(Ppop2[1:4], nPop2)\)
\(Ppop2_1 = P2*SeA*SeB + (1-P2)*(1-SpA)*(1-SpB)\)
\(Ppop2_2 = P2*SeA*(1-SeB) + (1-P2)*(1-SpA)*SpB\)
\(Ppop2_3 = P2*(1-SeA)*SeB + (1-P2)*SpA*(1-SpB)\)
\(Ppop2_4 = P2*(1-SeA)*(1-SeB) + (1-P2)*SpA*SpB\)
Where \(Ppop1_1\) to \(Ppop1_4\) and \(Ppop2_1\) to \(Ppop2_4\) are the probabilities of falling in a given cell of the 2x2 table in population 1 and population 2, respectively; and \(nPop1\) and \(nPop2\) are the numbers of individuals tested in population 1 and 2, respectively. Thus, if you look carefully, we simply duplicated the likelihood function used before and added subscripts to differentiate the cells, probabilities, and disease prevalences of the two populations. Pretty cool, right? If you have three, four, or more populations, you can extend the likelihood function accordingly.
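As a quick sanity check on this likelihood, the four cell probabilities of a given population must always sum to 1. A minimal numeric sketch (with arbitrary illustrative values for the parameters):

```r
#Arbitrary illustrative values for prevalence, Se, and Sp
P1 <- 0.3; SeA <- 0.9; SpA <- 0.95; SeB <- 0.8; SpB <- 0.85
Ppop1 <- c(P1*SeA*SeB         + (1-P1)*(1-SpA)*(1-SpB),
           P1*SeA*(1-SeB)     + (1-P1)*(1-SpA)*SpB,
           P1*(1-SeA)*SeB     + (1-P1)*SpA*(1-SpB),
           P1*(1-SeA)*(1-SeB) + (1-P1)*SpA*SpB)
sum(Ppop1) #always 1, whatever the parameter values
```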
To run such a model, we simply need to provide, for each population, a dataset where n1, n2, n3, and n4 are listed (in that order). For instance, if we use herd #1 and herd #2 from the PAG vs. US study, we will have the following 2x2 tables:
Table. Cross-classified results from the PAG and US diagnostic tests in herd #1 and herd #2.
| | Herd 1, US+ | Herd 1, US- | Herd 2, US+ | Herd 2, US- |
|---|---|---|---|---|
| PAG+ | 138 | 10 | 71 | 8 |
| PAG- | 1 | 113 | 0 | 69 |
| Total | 262 | | 148 | |
The dataset could thus be created as follows:
#n is of the form : (TestA pos and TestB pos), (TestA pos and TestB neg), (TestA neg and TestB pos), then (TestA neg and TestB neg)
datalist <- list(Pop1=c(138,1,10,113),
Pop2=c(71, 0, 8, 69)
)
We could provide the values that will be used to describe the prior distributions as we did before (so they are included in the model text file). The only difference is that we now have two prevalence parameters (P1 and P2) to describe. In the example below, I will use vague priors for all parameters (just because we can!).
#We could first create labels for TestA and TestB
TestA <- "US"
TestB <- "PAG"
#Provide information for the prior distributions (all beta distributions) for the 6 unknown parameters
Prev1.shapea <- 1 #a shape parameter for Prev in population 1
Prev1.shapeb <- 1 #b shape parameter for Prev in population 1
Prev2.shapea <- 1 #a shape parameter for Prev in population 2
Prev2.shapeb <- 1 #b shape parameter for Prev in population 2
Se.TestA.shapea <- 1 #a shape parameter for Se of TestA
Se.TestA.shapeb <- 1 #b shape parameter for Se of TestA
Sp.TestA.shapea <- 1 #a shape parameter for Sp of TestA
Sp.TestA.shapeb <- 1 #b shape parameter for Sp of TestA
Se.TestB.shapea <- 1 #a shape parameter for Se of TestB
Se.TestB.shapeb <- 1 #b shape parameter for Se of TestB
Sp.TestB.shapea <- 1 #a shape parameter for Sp of TestB
Sp.TestB.shapeb <- 1 #b shape parameter for Sp of TestB
#I will also need the total number of individuals tested in each population (nPop1 and nPop2)
n <- sapply(datalist, sum)
nPop1 <- n[1]
nPop2 <- n[2]
With that, we have everything that is needed to write the JAGS model.
#Create the JAGS text file
model_2tests_2pop_indep <- paste0("model{
#=== LIKELIHOOD ===#
#=== POPULATION 1 ===#
Pop1[1:4] ~ dmulti(p1[1:4], ",nPop1,")
p1[1] <- Prev1*Se_", TestA, "*Se_", TestB, " + (1-Prev1)*(1-Sp_", TestA, ")*(1-Sp_", TestB, ")
p1[2] <- Prev1*Se_", TestA, "*(1-Se_", TestB, ") + (1-Prev1)*(1-Sp_", TestA, ")*Sp_", TestB, "
p1[3] <- Prev1*(1-Se_", TestA, ")*Se_", TestB, " + (1-Prev1)*Sp_", TestA, "*(1-Sp_", TestB, ")
p1[4] <- Prev1*(1-Se_", TestA, ")*(1-Se_", TestB, ") + (1-Prev1)*Sp_", TestA, "*Sp_", TestB, "
#=== POPULATION 2 ===#
Pop2[1:4] ~ dmulti(p2[1:4], ",nPop2,")
p2[1] <- Prev2*Se_", TestA, "*Se_", TestB, " + (1-Prev2)*(1-Sp_", TestA, ")*(1-Sp_", TestB, ")
p2[2] <- Prev2*Se_", TestA, "*(1-Se_", TestB, ") + (1-Prev2)*(1-Sp_", TestA, ")*Sp_", TestB, "
p2[3] <- Prev2*(1-Se_", TestA, ")*Se_", TestB, " + (1-Prev2)*Sp_", TestA, "*(1-Sp_", TestB, ")
p2[4] <- Prev2*(1-Se_", TestA, ")*(1-Se_", TestB, ") + (1-Prev2)*Sp_", TestA, "*Sp_", TestB, "
#=== PRIOR ===#
Prev1 ~ dbeta(",Prev1.shapea,", ",Prev1.shapeb,") ## Prior for Prevalence in population 1
Prev2 ~ dbeta(",Prev2.shapea,", ",Prev2.shapeb,") ## Prior for Prevalence in population 2
Se_", TestA, " ~ dbeta(",Se.TestA.shapea,", ",Se.TestA.shapeb,") ## Prior for Se of Test A
Sp_", TestA, " ~ dbeta(",Sp.TestA.shapea,", ",Sp.TestA.shapeb,") ## Prior for Sp of Test A
Se_", TestB, " ~ dbeta(",Se.TestB.shapea,", ",Se.TestB.shapeb,") ## Prior for Se of Test B
Sp_", TestB, " ~ dbeta(",Sp.TestB.shapea,", ",Sp.TestB.shapeb,") ## Prior for Sp of Test B
}")
#write as a text (.txt) file
write.table(model_2tests_2pop_indep,
file="model_2tests_2pop_indep.txt",
quote=FALSE,
sep="",
row.names=FALSE,
col.names=FALSE)
With this code, you could, again, simply modify the labels for Test A and Test B, and the shape parameters for the prior distributions. The text file with the JAGS model will automatically be updated. Currently, it looks like this:
Text file with the model for 2 independent diagnostic tests applied to two populations.
Again, we will need to provide a list of initial values (one per Markov chain) for all unknown parameters. Be careful: we now have two prevalences (Prev1 and Prev2).
#Initializing values for the parameters Prev1, Prev2, and the Ses and Sps of the two tests for the 3 chains.
inits <- list(list(Prev1=0.50,
Prev2=0.50,
Se_US=0.90,
Sp_US=0.90,
Se_PAG=0.90,
Sp_PAG=0.90),
list(Prev1=0.60,
Prev2=0.60,
Se_US=0.80,
Sp_US=0.80,
Se_PAG=0.80,
Sp_PAG=0.80),
list(Prev1=0.40,
Prev2=0.40,
Se_US=0.70,
Sp_US=0.70,
Se_PAG=0.70,
Sp_PAG=0.70)
)
We can run the model using the jags() function as seen before.
library(R2jags)
library(coda)
#Run the Bayesian model
bug.out <- jags(data=datalist,
model.file="model_2tests_2pop_indep.txt",
parameters.to.save=c("Prev1", "Prev2", "Se_US", "Sp_US", "Se_PAG", "Sp_PAG"),
n.chains=3,
inits=inits,
n.iter=11000,
n.burnin=1000,
n.thin=1,
DIC=FALSE)
Then we could produce the diagnostic plots, compute the ESS, and print out our results as we did previously (results not shown).
library(mcmcplots)
bug.mcmc <- as.mcmc(bug.out)
mcmcplot(bug.mcmc, title="Diagnostic plots")
effectiveSize(bug.mcmc)
print(bug.out, digits.summary=3)
As discussed before, an important assumption of the LCM described in the previous section is that diagnostic tests are conditionally independent. This assumption, however, can be relaxed using the method proposed by Dendukuri and Joseph (2001). Note that we are now increasing the number of unknown parameters and we will, therefore, have to provide additional informative prior distributions.
Table. Number of degrees of freedom and of unknown parameters when comparing two conditionally dependent diagnostic tests across various number of populations (adapted from Cheung et al. (2021)).
| Number of Populations | Number of unknown parameters (np) | Number of degrees of freedom (df) | Minimum number of informative priors needed |
|---|---|---|---|
| 1 | 7 | 3 | 4 |
| 2 | 8 | 6 | 2 |
| 3 | 9 | 9 | 0 |
| 4 | 10 | 12 | 0 |
| p | 6+p | 3*p | if df < np, then np-df |
For this example, in addition to specifying priors for covp and covn, I will add informative priors on US Se and Sp.
library(epiR)
# US sensitivity ----------------------------------------------------------
# Sensitivity of US: Mode=0.90, and we are 97.5% sure >0.85
Se.US <- epi.betabuster(mode=0.90, conf=0.975, greaterthan=T, x=0.85)
# US specificity ----------------------------------------------------------
# Specificity of US: Mode=0.95, and we are 97.5% sure >0.90
Sp.US <- epi.betabuster(mode=0.95, conf=0.975, greaterthan=T, x=0.90)
Se.TestA.shapea <- Se.US$shape1 #a shape parameter for Se of TestA
Se.TestA.shapeb <- Se.US$shape2 #b shape parameter for Se of TestA
Sp.TestA.shapea <- Sp.US$shape1 #a shape parameter for Sp of TestA
Sp.TestA.shapeb <- Sp.US$shape2 #b shape parameter for Sp of TestA
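If you want to double-check an elicited prior, you can re-derive it and verify its mode and 2.5th percentile. A small sketch for the US Se prior (the mode of a Beta(a, b) with a, b > 1 is (a-1)/(a+b-2)):

```r
library(epiR)
#Re-derive the US Se prior used above and sanity-check it
Se.US <- epi.betabuster(mode=0.90, conf=0.975, greaterthan=TRUE, x=0.85)
a <- Se.US$shape1
b <- Se.US$shape2
(a - 1)/(a + b - 2)  #mode of the beta distribution; should be close to 0.90
qbeta(0.025, a, b)   #2.5th percentile; should be close to 0.85
```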
Now, we can add the covp and covn parameters to the likelihood function, and specify prior distributions for them as we did before:
#Create the JAGS text file
model_2tests_2pop_dep <- paste0("model{
#=== LIKELIHOOD ===#
#=== POPULATION 1 ===#
Pop1[1:4] ~ dmulti(p1[1:4], ",nPop1,")
p1[1] <- Prev1*(Se_", TestA, "*Se_", TestB, " + covp) + (1-Prev1)*((1-Sp_", TestA, ")*(1-Sp_", TestB, ") + covn)
p1[2] <- Prev1*(Se_", TestA, "*(1-Se_", TestB, ") - covp) + (1-Prev1)*((1-Sp_", TestA, ")*Sp_", TestB, " - covn)
p1[3] <- Prev1*((1-Se_", TestA, ")*Se_", TestB, " - covp) + (1-Prev1)*(Sp_", TestA, "*(1-Sp_", TestB, ") - covn)
p1[4] <- Prev1*((1-Se_", TestA, ")*(1-Se_", TestB, ") + covp) + (1-Prev1)*(Sp_", TestA, "*Sp_", TestB, " + covn)
#=== POPULATION 2 ===#
Pop2[1:4] ~ dmulti(p2[1:4], ",nPop2,")
p2[1] <- Prev2*(Se_", TestA, "*Se_", TestB, " + covp) + (1-Prev2)*((1-Sp_", TestA, ")*(1-Sp_", TestB, ") + covn)
p2[2] <- Prev2*(Se_", TestA, "*(1-Se_", TestB, ") - covp) + (1-Prev2)*((1-Sp_", TestA, ")*Sp_", TestB, " - covn)
p2[3] <- Prev2*((1-Se_", TestA, ")*Se_", TestB, " - covp) + (1-Prev2)*(Sp_", TestA, "*(1-Sp_", TestB, ") - covn)
p2[4] <- Prev2*((1-Se_", TestA, ")*(1-Se_", TestB, ") + covp) + (1-Prev2)*(Sp_", TestA, "*Sp_", TestB, " + covn)
#=== PRIOR ===#
Prev1 ~ dbeta(",Prev1.shapea,", ",Prev1.shapeb,") ## Prior for Prevalence in population 1
Prev2 ~ dbeta(",Prev2.shapea,", ",Prev2.shapeb,") ## Prior for Prevalence in population 2
Se_", TestA, " ~ dbeta(",Se.TestA.shapea,", ",Se.TestA.shapeb,") ## Prior for Se of Test A
Sp_", TestA, " ~ dbeta(",Sp.TestA.shapea,", ",Sp.TestA.shapeb,") ## Prior for Sp of Test A
Se_", TestB, " ~ dbeta(",Se.TestB.shapea,", ",Se.TestB.shapeb,") ## Prior for Se of Test B
Sp_", TestB, " ~ dbeta(",Sp.TestB.shapea,", ",Sp.TestB.shapeb,") ## Prior for Sp of Test B
#=== CONDITIONAL DEPENDENCE STRUCTURE ===#
covp ~ dunif(minp,maxp)
covn ~ dunif(minn,maxn)
minp <- (1-Se_", TestA, ")*(Se_", TestB, "-1)
minn <- (Sp_", TestA, "-1)*(1-Sp_", TestB, ")
maxp <- min(Se_", TestA, ",Se_", TestB, ") - Se_", TestA, "*Se_", TestB, "
maxn <- min(Sp_", TestA, ",Sp_", TestB, ") - Sp_", TestA, "*Sp_", TestB, "
}")
#write as a text (.txt) file
write.table(model_2tests_2pop_dep,
file="model_2tests_2pop_dep.txt",
quote=FALSE,
sep="",
row.names=FALSE,
col.names=FALSE)
This text file looks like this:
Text file with the model for 2 DEPENDENT diagnostic tests applied to two populations.
Again, the list of initial values will have to be modified to include covp and covn.
#Initializing values for the parameters Prev1, Prev2, the Ses and Sps of the two tests, and covp and covn for the 3 chains.
inits <- list(list(Prev1=0.50,
Prev2=0.50,
Se_US=0.90,
Sp_US=0.90,
Se_PAG=0.90,
Sp_PAG=0.90,
covp=0.0,
covn=0.0),
list(Prev1=0.60,
Prev2=0.60,
Se_US=0.80,
Sp_US=0.80,
Se_PAG=0.80,
Sp_PAG=0.80,
covp=0.01,
covn=0.01),
list(Prev1=0.40,
Prev2=0.40,
Se_US=0.70,
Sp_US=0.70,
Se_PAG=0.70,
Sp_PAG=0.70,
covp=0.05,
covn=0.05)
)
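Because covp and covn have bounds that depend on the Se and Sp values, it is worth checking that each chain’s starting covariance lies inside the bounds implied by its starting Se values. A small self-contained sketch (the values copy the inits list above; the minp/maxp formulas are the same as in the JAGS model):

```r
#Starting values for each chain (copied from the inits list above)
starts <- list(c(Se_US=0.90, Se_PAG=0.90, covp=0.00),
               c(Se_US=0.80, Se_PAG=0.80, covp=0.01),
               c(Se_US=0.70, Se_PAG=0.70, covp=0.05))
ok <- sapply(starts, function(s) {
  minp <- (1 - s["Se_US"])*(s["Se_PAG"] - 1)
  maxp <- min(s["Se_US"], s["Se_PAG"]) - s["Se_US"]*s["Se_PAG"]
  s["covp"] >= minp && s["covp"] <= maxp
})
all(ok) #all three chains start inside the allowed covariance range
```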
And, when running the model using jags(), we would also ask to monitor covp and covn. Note that I have increased the number of iterations and the burn-in period (remember the autocorrelation problem seen with the LCM with a covariance between tests?).
library(R2jags)
library(coda)
#Run the Bayesian model
bug.out <- jags(data=datalist,
model.file="model_2tests_2pop_dep.txt",
parameters.to.save=c("Prev1", "Prev2", "Se_US", "Sp_US", "Se_PAG", "Sp_PAG", "covp", "covn"),
n.chains=3,
inits=inits,
n.iter=210000,
n.burnin=10000,
n.thin=1,
DIC=FALSE)
## Compiling model graph
## Resolving undeclared variables
## Allocating nodes
## Graph information:
## Observed stochastic nodes: 2
## Unobserved stochastic nodes: 8
## Total graph size: 72
##
## Initializing model
Then we could produce the diagnostic plots, compute the ESS, and print out our results as we did previously (results not shown).
library(mcmcplots)
bug.mcmc <- as.mcmc(bug.out)
mcmcplot(bug.mcmc, title="Diagnostic plots")
effectiveSize(bug.mcmc)
print(bug.out, digits.summary=3)
Open up the R project for Exercise 3 (i.e., the R project file named Exercise 3.Rproj).
In the Exercise 3 folder, you will find partially completed R scripts. To answer the questions, try to complete the missing parts (they are highlighted by a #TO COMPLETE# comment). We also provided complete R scripts, but try to work it out on your own first!
We know that US is not a gold-standard test for diagnosing pregnancy in dairy cows between 28-45 days post-AI. Romano et al. (2006) have reported estimates of Se and Sp for US that we have used in the preceding exercises. However, these US accuracy estimates were themselves computed without comparison to a gold standard, and the authors did not use a LCM methodology. The reported US accuracy estimates are, thus, possibly biased to some extent.
It would be great if we could compute Se and Sp estimates for the PAG test, but without using these potentially biased US accuracy estimates. In the complete study, we have results for cows from three herds. Assuming that pregnancy prevalence is sufficiently different between herds, we could possibly use the Hui and Walter model with 3 populations. Cross-tabulated PAG and US data from the 3 herds are presented below.
Table. Cross-classified results from the PAG and US diagnostic tests in the three herds studied.
| | Herd 1, US+ | Herd 1, US- | Herd 2, US+ | Herd 2, US- | Herd 3, US+ | Herd 3, US- |
|---|---|---|---|---|---|---|
| PAG+ | 138 | 10 | 71 | 8 | 46 | 3 |
| PAG- | 1 | 113 | 0 | 69 | 1 | 37 |
| Total | 262 | | 148 | | 87 | |
1. Using a Hui and Walter model with these three populations, how many unknown parameters and degrees of freedom would you have, and how many informative priors will you need?
2. Based on the observed results (do not compute a model yet, just ballpark estimates), do you think pregnancy prevalence differ between the 3 herds? Do you have any biological reason to believe that PAG or US Se and/or Sp should not be the same in herd #1 vs. herd #2 vs. herd #3?
3. Start from the partially completed Question 3.R script located in Exercise 3 folder. Try to apply a Hui and Walter model for three populations to these data using informative pregnancy prevalence (mode=0.42, 97.5th percentile=0.74) and informative US accuracy priors (Se: mode=0.90, 2.5th percentile=0.85; Sp: mode=0.95, 2.5th percentile=0.90). For this analysis, consider that the two tests are conditionally independent. Then use the same LCM, but with vague priors for all parameters and compare your results.
1. Using a Hui and Walter model with these three populations, how many unknown parameters and degrees of freedom would you have, and how many informative priors will you need?
Answer: We would still have 2 Se and 2 Sp, and now 3 prevalences (one for each herd) to estimate, thus 7 unknown parameters. Each herd contributes 3 degrees of freedom, for a total of 9. With 7 unknown parameters and 9 degrees of freedom, we do not need any informative priors to estimate these parameters.
2. Based on the observed results (do not compute a model yet, just ballpark estimates), do you think pregnancy prevalence differ between the 3 herds? Do you have any biological reason to believe that PAG or US Se and/or Sp should not be the same in herd #1 vs. herd #2 vs. herd #3?
Answer: We do not know the true pregnancy prevalence in these herds, but we can explore the apparent prevalences observed using the tests under investigation. Here are the apparent prevalence estimates computed using either the PAG or the US results, by herd.
Table. Apparent prevalence of pregnancy in 3 herds using either a PAG or US test.
| Herd | PAG-based prevalence | US-based prevalence |
|---|---|---|
| 1 | 0.56 | 0.53 |
| 2 | 0.53 | 0.48 |
| 3 | 0.56 | 0.54 |
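These apparent prevalences can be recomputed directly from the cross-classified counts; a small sketch (cell order assumed: US+PAG+, US+PAG-, US-PAG+, US-PAG-):

```r
#Cross-classified counts by herd (US+PAG+, US+PAG-, US-PAG+, US-PAG-)
herds <- list(herd1=c(138, 1, 10, 113),
              herd2=c(71, 0, 8, 69),
              herd3=c(46, 1, 3, 37))
app.prev <- t(sapply(herds, function(x)
  c(PAG=(x[1] + x[3])/sum(x),   #PAG+ cells
    US =(x[1] + x[2])/sum(x)))) #US+ cells
round(app.prev, 2)
```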
These prevalences are fairly similar… In a real-life situation, I would probably try to find another way to split the dataset into different populations so that we have more striking outcome prevalence differences between populations (e.g., cows with < 3 AI vs. cows with ≥ 3 AI; 1st lactation cows vs. older cows). But, for the sake of the exercise, we can still try to apply the Hui and Walter model to these data.
Regarding PAG or US Se and/or Sp changing between the three herds, it is difficult to imagine any reason why PAG or US accuracy would differ from one herd to another. We would probably be comfortable considering that, no matter the herd, the tests’ Se and Sp are the same.
3. Try to apply a Hui and Walter model for three populations to these data using informative pregnancy prevalence (mode=0.42, 97.5th percentile=0.74) and informative US accuracy priors (Se: mode=0.90, 2.5th percentile=0.85; Sp: mode=0.95, 2.5th percentile=0.90). For this analysis, consider that the two tests are conditionally independent. Then use the same LCM, but with vague priors for all parameters and compare your results.
Answer: The model with informative priors ran very smoothly, as expected.
The model using vague priors for all parameters also converged very easily, and autocorrelation was not a problem. Isn’t that a bit surprising? We would have expected that, with prevalences that are not very different between the three herds, the Hui and Walter model would not necessarily be identifiable. In this specific example, however, we have two very accurate tests and, therefore, limited disagreement between them (check the three 2x2 tables). The number of potential solutions is thus very limited, and the model is still identifiable even though we do not have striking prevalence differences between herds. It may not work as well in a different dataset. Below, I have summarized the results of the LCM with informative vs. vague priors.
Table. Results from a 2 tests, 3 populations LCM using different priors (informative on pregnancy prevalence and on US Se and Sp vs. vague priors on all parameters).
| Parameter | Prior (informative model) | Posterior (informative model) | Prior (vague model) | Posterior (vague model) |
|---|---|---|---|---|
| Prev1 | Beta(4.2, 5.4) | 0.554 (0.493, 0.614) | Beta(1.0, 1.0) | 0.547 (0.482, 0.612) |
| Prev2 | Beta(4.2, 5.4) | 0.517 (0.436, 0.595) | Beta(1.0, 1.0) | 0.505 (0.418, 0.590) |
| Prev3 | Beta(4.2, 5.4) | 0.547 (0.444, 0.646) | Beta(1.0, 1.0) | 0.551 (0.445, 0.654) |
| Se US | Beta(100, 12) | 0.926 (0.893, 0.955) | Beta(1.0, 1.0) | 0.961 (0.908, 0.998) |
| Sp US | Beta(100, 6.2) | 0.977 (0.956, 0.990) | Beta(1.0, 1.0) | 0.993 (0.972, 1.000) |
| Se PAG | Beta(1.0, 1.0) | 0.996 (0.980, 1.000) | Beta(1.0, 1.0) | 0.994 (0.976, 1.000) |
| Sp PAG | Beta(1.0, 1.0) | 0.980 (0.956, 0.990) | Beta(1.0, 1.0) | 0.949 (0.891, 0.997) |
There are small differences between the approaches. In the LCM with informative priors, we indicated, as prior knowledge, that US Se and Sp were around 0.90 and 0.95, respectively. Our posterior estimates for these parameters were 0.93 and 0.98. So, our data indicate that US Se and Sp are possibly better than what was reported by Romano et al. This is confirmed in the model where we used vague priors: in that latter LCM, only the data were used to provide Se and Sp estimates, and US Se and Sp were estimated at 0.96 and 0.99, respectively.
In the LCM with vague priors, the Se of US was estimated at 0.96 (vs. 0.93); as a result, the PAG Sp was slightly reduced (0.95 vs. 0.98). In other words, in the LCM with informative priors, most of the US-/PAG+ results were attributed to errors of the US exam (because of the prior US Se with a mode of 0.90). In the LCM with vague priors, the US-/PAG+ disagreements were distributed more equally between errors of the PAG test and errors of the US test.
In some situations, individuals studied will be tested using more than two diagnostic tests. For instance, if all individuals were tested using three different diagnostic tests (Test A, Test B, and Test C) we could present the cross-classified results of the three tests as follows:
Table. Cross-classified results from three diagnostic tests in one population.
| | Test C+, Test A+ | Test C+, Test A- | Test C-, Test A+ | Test C-, Test A- |
|---|---|---|---|---|
| Test B+ | n1 | n3 | n5 | n7 |
| Test B- | n2 | n4 | n6 | n8 |
Contrary to the case where we had two tests in two populations, we can see from the preceding table that we have seven degrees of freedom available. Indeed, we need to know the total number of individuals (n) and the content of seven of the eight cells to deduce the value of the last cell. Moreover, with three diagnostic tests, we would have seven unknown parameters (SeA and SpA, SeB and SpB, SeC and SpC, and the population’s prevalence, P) to estimate. The model is, therefore, identifiable. Again, we can refer to Cheung et al. (2021) to describe how the number of unknown parameters and degrees of freedom vary as a function of the number of diagnostic tests used (see below).
Table. Number of degrees of freedom and of unknown parameters when comparing an increasing number of conditionally independent diagnostic tests in one single population (adapted from Cheung et al. (2021)).
| Number of Tests | Number of unknown parameters (np) | Number of degrees of freedom (df) | Minimum number of informative priors needed |
|---|---|---|---|
| 2 | 5 | 3 | 2 |
| 3 | 7 | 7 | 0 |
| 4 | 9 | 15 | 0 |
| k | 2*k+1 | 2^k -1 | if df < np, then np-df |
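As with the multiple-population case, the arithmetic of the table above is easy to reproduce (lcm_k_tests is a hypothetical helper name, not from any package):

```r
#Unknown parameters, degrees of freedom, and minimum number of informative
#priors for k conditionally independent tests in a single population
lcm_k_tests <- function(k) {
  np <- 2*k + 1  #one Se and one Sp per test, plus the prevalence
  df <- 2^k - 1  #2^k cells, minus one because the counts sum to n
  data.frame(tests=k, np=np, df=df, priors_needed=pmax(np - df, 0))
}
lcm_k_tests(2:4)
```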
As you can see, adding diagnostic tests can be an extremely valuable option to make a LCM identifiable. The gain in statistical power, however, will be maximized if:
- Many conditionally independent diagnostic tests are used;
- The sample size is sufficient so that all test combinations are represented (i.e., no empty cells in the table presenting the cross-classified results).
With three diagnostic tests applied to a single population, the likelihood function that could be used to link the observed data (n1, …, n8) to the unknown parameters (SeA and SpA, SeB and SpB, SeC and SpC, and P) can be described as follows:
\(n[1:8] \sim dmulti(P[1:8], n)\)
\(P1 = P*SeA*SeB*SeC + (1-P)*(1-SpA)*(1-SpB)*(1-SpC)\)
\(P2 = P*SeA*(1-SeB)*SeC + (1-P)*(1-SpA)*SpB*(1-SpC)\)
\(P3 = P*(1-SeA)*SeB*SeC + (1-P)*SpA*(1-SpB)*(1-SpC)\)
\(P4 = P*(1-SeA)*(1-SeB)*SeC + (1-P)*SpA*SpB*(1-SpC)\)
\(P5 = P*SeA*SeB*(1-SeC) + (1-P)*(1-SpA)*(1-SpB)*SpC\)
\(P6 = P*SeA*(1-SeB)*(1-SeC) + (1-P)*(1-SpA)*SpB*SpC\)
\(P7 = P*(1-SeA)*SeB*(1-SeC) + (1-P)*SpA*(1-SpB)*SpC\)
\(P8 = P*(1-SeA)*(1-SeB)*(1-SeC) + (1-P)*SpA*SpB*SpC\)
Again, if we put that in words, the number of individuals in cell n1 (i.e., TestA+ and TestB+ and TestC+) is determined by the number of individuals tested (n) and the probability (P1) of falling in that cell. The latter itself depends on the true prevalence of disease in the population (P) and on each test’s sensitivity (SeA, SeB, and SeC) and specificity (SpA, SpB, and SpC). We can use similar reasoning for n2 to n8. Finally, the multinomial distribution further imposes that the eight probabilities (P1 to P8) sum up to 1.0 (i.e., 100%).
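Rather than typing the eight probabilities by hand, they can also be generated programmatically; a sketch with arbitrary illustrative parameter values (note that expand.grid orders rows with the first test varying fastest, so the rows would need reordering to match the n1 to n8 ordering used here):

```r
#Arbitrary illustrative values
P <- 0.4
Se <- c(A=0.90, B=0.95, C=0.85)
Sp <- c(A=0.92, B=0.90, C=0.88)
#All 8 combinations of test results (1 = positive, 0 = negative)
cells <- expand.grid(A=c(1, 0), B=c(1, 0), C=c(1, 0))
probs <- apply(cells, 1, function(r)
  P*prod(ifelse(r == 1, Se, 1 - Se)) +
  (1 - P)*prod(ifelse(r == 1, 1 - Sp, Sp)))
sum(probs) #the eight cell probabilities always sum to 1
```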
To run such a model, we simply need to provide a dataset where n1, n2, n3, n4, n5, n6, n7, and n8 are listed (in that order). For instance, in the PAG vs. US study, we have a number of cows that also had a conventional transrectal exam (TRE). The results of the three tests can be presented as follows:
Table. Cross-classified results from a PAG, US, and TRE diagnostic tests in one population.
| | TRE+, US+ | TRE+, US- | TRE-, US+ | TRE-, US- |
|---|---|---|---|---|
| PAG+ | 240 | 12 | 15 | 9 |
| PAG- | 1 | 2 | 1 | 216 |
The dataset could thus be created as follows:
#n is of the form : (TestA+ TestB+ TestC+), (TestA+ TestB- TestC+), (TestA- TestB+ TestC+), (TestA- TestB- TestC+), (TestA+ TestB+ TestC-), (TestA+ TestB- TestC-), (TestA- TestB+ TestC-), (TestA- TestB- TestC-)
datalist <- list(n=c(240, 1, 12, 2, 15, 1, 9, 216))
Again, we could provide the values that will be used to describe the prior distributions as we did before (they will then be included in the LCM text file). The only difference is that we now have three Se and three Sp to describe. In the example below, I will use vague priors for all parameters.
#We could first create labels for TestA, TestB, and TestC
TestA <- "US"
TestB <- "PAG"
TestC <- "TRE"
#Provide information for the prior distributions (all beta distributions) for the 7 unknown parameters
Prev.shapea <- 1 #a shape parameter for Prev
Prev.shapeb <- 1 #b shape parameter for Prev
Se.TestA.shapea <- 1 #a shape parameter for Se of TestA
Se.TestA.shapeb <- 1 #b shape parameter for Se of TestA
Sp.TestA.shapea <- 1 #a shape parameter for Sp of TestA
Sp.TestA.shapeb <- 1 #b shape parameter for Sp of TestA
Se.TestB.shapea <- 1 #a shape parameter for Se of TestB
Se.TestB.shapeb <- 1 #b shape parameter for Se of TestB
Sp.TestB.shapea <- 1 #a shape parameter for Sp of TestB
Sp.TestB.shapeb <- 1 #b shape parameter for Sp of TestB
Se.TestC.shapea <- 1 #a shape parameter for Se of TestC
Se.TestC.shapeb <- 1 #b shape parameter for Se of TestC
Sp.TestC.shapea <- 1 #a shape parameter for Sp of TestC
Sp.TestC.shapeb <- 1 #b shape parameter for Sp of TestC
#I will also need the total number of individuals tested
n <- sapply(datalist, sum)
We now have everything that is needed to write the JAGS model.
#Create the JAGS text file
model_3tests_1pop_indep <- paste0("model{
#=== LIKELIHOOD ===#
n[1:8] ~ dmulti(p[1:8], ",n,")
p[1] <- Prev*Se_", TestA, "*Se_", TestB, "*Se_", TestC, " + (1-Prev)*(1-Sp_", TestA, ")*(1-Sp_", TestB, ")*(1-Sp_", TestC,")
p[2] <- Prev*Se_", TestA, "*(1-Se_", TestB, ")*Se_", TestC, " + (1-Prev)*(1-Sp_", TestA, ")*Sp_", TestB, "*(1-Sp_", TestC,")
p[3] <- Prev*(1-Se_", TestA, ")*Se_", TestB, "*Se_", TestC, " + (1-Prev)*Sp_", TestA, "*(1-Sp_", TestB, ")*(1-Sp_", TestC,")
p[4] <- Prev*(1-Se_", TestA, ")*(1-Se_", TestB, ")*Se_", TestC, " + (1-Prev)*Sp_", TestA, "*Sp_", TestB, "*(1-Sp_", TestC,")
p[5] <- Prev*Se_", TestA, "*Se_", TestB, "*(1-Se_", TestC,") + (1-Prev)*(1-Sp_", TestA, ")*(1-Sp_", TestB, ")*Sp_", TestC,"
p[6] <- Prev*Se_", TestA, "*(1-Se_", TestB, ")*(1-Se_", TestC,") + (1-Prev)*(1-Sp_", TestA, ")*Sp_", TestB, "*Sp_", TestC,"
p[7] <- Prev*(1-Se_", TestA, ")*Se_", TestB, "*(1-Se_", TestC,") + (1-Prev)*Sp_", TestA, "*(1-Sp_", TestB, ")*Sp_", TestC,"
p[8] <- Prev*(1-Se_", TestA, ")*(1-Se_", TestB, ")*(1-Se_", TestC,") + (1-Prev)*Sp_", TestA, "*Sp_", TestB, "*Sp_", TestC,"
#=== PRIOR ===#
Prev ~ dbeta(",Prev.shapea,", ",Prev.shapeb,") ## Prior for Prevalence
Se_", TestA, " ~ dbeta(",Se.TestA.shapea,", ",Se.TestA.shapeb,") ## Prior for Se of Test A
Sp_", TestA, " ~ dbeta(",Sp.TestA.shapea,", ",Sp.TestA.shapeb,") ## Prior for Sp of Test A
Se_", TestB, " ~ dbeta(",Se.TestB.shapea,", ",Se.TestB.shapeb,") ## Prior for Se of Test B
Sp_", TestB, " ~ dbeta(",Sp.TestB.shapea,", ",Sp.TestB.shapeb,") ## Prior for Sp of Test B
Se_", TestC, " ~ dbeta(",Se.TestC.shapea,", ",Se.TestC.shapeb,") ## Prior for Se of Test C
Sp_", TestC, " ~ dbeta(",Sp.TestC.shapea,", ",Sp.TestC.shapeb,") ## Prior for Sp of Test C
}")
#write as a text (.txt) file
write.table(model_3tests_1pop_indep,
file="model_3tests_1pop_indep.txt",
quote=FALSE,
sep="",
row.names=FALSE,
col.names=FALSE)
With this code, you could, again, simply modify the labels for Test A, Test B, and Test C, and the shape parameters for the prior distributions. The text file with the JAGS model will automatically be updated. Currently, it looks like this:
Text file with the LCM for 3 independent diagnostic tests applied to one population.
Again, we will need to provide a list of initial values (one per Markov chain) for all unknown parameters. Be careful: we now have three tests.
#Initializing values for the parameters Prev, and the Ses and Sps of the three tests for the 3 chains.
inits <- list(list(Prev=0.50,
Se_US=0.90,
Sp_US=0.90,
Se_PAG=0.90,
Sp_PAG=0.90,
Se_TRE=0.90,
Sp_TRE=0.90),
list(Prev=0.60,
Se_US=0.80,
Sp_US=0.80,
Se_PAG=0.80,
Sp_PAG=0.80,
Se_TRE=0.80,
Sp_TRE=0.80),
list(Prev=0.40,
Se_US=0.70,
Sp_US=0.70,
Se_PAG=0.70,
Sp_PAG=0.70,
Se_TRE=0.70,
Sp_TRE=0.70)
)
We can run the model using the jags() function as seen before.
library(R2jags)
library(coda)
#Run the Bayesian model
bug.out <- jags(data=datalist,
model.file="model_3tests_1pop_indep.txt",
parameters.to.save=c("Prev", "Se_US", "Sp_US", "Se_PAG", "Sp_PAG", "Se_TRE", "Sp_TRE"),
n.chains=3,
inits=inits,
n.iter=11000,
n.burnin=1000,
n.thin=1,
DIC=FALSE)
Then we could produce the diagnostic plots, compute the ESS, and print out our results as we did previously (results not shown).
library(mcmcplots)
bug.mcmc <- as.mcmc(bug.out)
mcmcplot(bug.mcmc, title="Diagnostic plots")
effectiveSize(bug.mcmc)
print(bug.out, digits.summary=3)
With LCM where >2 diagnostic tests are used, we could, again, relax the assumption of conditional independence by adding covariance terms representing the conditional dependence between tests (Dendukuri and Joseph, 2001). However, a LCM including every covariance term between every pair of tests is rarely feasible. A more pragmatic approach is to consider pair-wise covariance terms between two tests.
For instance, using the example from the preceding section, we could assume that US and TRE, which are both based on recognition (tactile or visual) of the amniotic vesicle or embryo, are possibly conditionally dependent (as compared to the PAG test, which measures a glycoprotein associated with the placenta). We could thus include covariance terms between these two specific tests. Remember, we had set up US as Test A and TRE as Test C. It will be easier, however, to manage the covariance terms if the dependent tests are next to each other within the likelihood function. I will, therefore, re-assign the tests as follows, and reorganize the dataset with that new test configuration.
- Test A: US
- Test B: TRE
- Test C: PAG
TestA <- "US"
TestB <- "TRE"
TestC <- "PAG"
#n is of the form : (TestA+ TestB+ TestC+), (TestA+ TestB- TestC+), (TestA- TestB+ TestC+), (TestA- TestB- TestC+), (TestA+ TestB+ TestC-), (TestA+ TestB- TestC-), (TestA- TestB+ TestC-), (TestA- TestB- TestC-)
datalist <- list(n=c(240, 15, 12, 9, 1, 1, 2, 216))
n <- sapply(datalist, sum)
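A quick consistency check never hurts after reordering cells by hand: the reordered vector must contain exactly the same eight cell counts (and total) as the original one. A small sketch:

```r
#Original ordering: US=TestA, PAG=TestB, TRE=TestC
original  <- c(240, 1, 12, 2, 15, 1, 9, 216)
#New ordering: US=TestA, TRE=TestB, PAG=TestC
reordered <- c(240, 15, 12, 9, 1, 1, 2, 216)
sum(original) == sum(reordered)              #same total number of cows
identical(sort(original), sort(reordered))   #same set of cell counts
```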
#Create the JAGS text file
model_3tests_1pop_dep <- paste0("model{
#=== LIKELIHOOD ===#
n[1:8] ~ dmulti(p[1:8], ",n,")
p[1] <- Prev*(Se_", TestA, "*Se_", TestB, " + covp)*Se_", TestC, " + (1-Prev)*((1-Sp_", TestA, ")*(1-Sp_", TestB, ") + covn)*(1-Sp_", TestC,")
p[2] <- Prev*(Se_", TestA, "*(1-Se_", TestB, ") - covp)*Se_", TestC, " + (1-Prev)*((1-Sp_", TestA, ")*Sp_", TestB, " - covn)*(1-Sp_", TestC,")
p[3] <- Prev*((1-Se_", TestA, ")*Se_", TestB, " - covp)*Se_", TestC, " + (1-Prev)*(Sp_", TestA, "*(1-Sp_", TestB, ") - covn)*(1-Sp_", TestC,")
p[4] <- Prev*((1-Se_", TestA, ")*(1-Se_", TestB, ") + covp)*Se_", TestC, " + (1-Prev)*(Sp_", TestA, "*Sp_", TestB, " + covn)*(1-Sp_", TestC,")
p[5] <- Prev*(Se_", TestA, "*Se_", TestB, " + covp)*(1-Se_", TestC,") + (1-Prev)*((1-Sp_", TestA, ")*(1-Sp_", TestB, ") + covn)*Sp_", TestC,"
p[6] <- Prev*(Se_", TestA, "*(1-Se_", TestB, ") - covp)*(1-Se_", TestC,") + (1-Prev)*((1-Sp_", TestA, ")*Sp_", TestB, " - covn)*Sp_", TestC,"
p[7] <- Prev*((1-Se_", TestA, ")*Se_", TestB, " - covp)*(1-Se_", TestC,") + (1-Prev)*(Sp_", TestA, "*(1-Sp_", TestB, ") - covn)*Sp_", TestC,"
p[8] <- Prev*((1-Se_", TestA, ")*(1-Se_", TestB, ") + covp)*(1-Se_", TestC,") + (1-Prev)*(Sp_", TestA, "*Sp_", TestB, " + covn)*Sp_", TestC,"
#=== PRIOR ===#
Prev ~ dbeta(",Prev.shapea,", ",Prev.shapeb,") ## Prior for Prevalence
Se_", TestA, " ~ dbeta(",Se.TestA.shapea,", ",Se.TestA.shapeb,") ## Prior for Se of Test A
Sp_", TestA, " ~ dbeta(",Sp.TestA.shapea,", ",Sp.TestA.shapeb,") ## Prior for Sp of Test A
Se_", TestB, " ~ dbeta(",Se.TestB.shapea,", ",Se.TestB.shapeb,") ## Prior for Se of Test B
Sp_", TestB, " ~ dbeta(",Sp.TestB.shapea,", ",Sp.TestB.shapeb,") ## Prior for Sp of Test B
Se_", TestC, " ~ dbeta(",Se.TestC.shapea,", ",Se.TestC.shapeb,") ## Prior for Se of Test C
Sp_", TestC, " ~ dbeta(",Sp.TestC.shapea,", ",Sp.TestC.shapeb,") ## Prior for Sp of Test C
#=== CONDITIONAL DEPENDENCE STRUCTURE ===#
covp ~ dunif(minp,maxp)
covn ~ dunif(minn,maxn)
minp <- (1-Se_", TestA, ")*(Se_", TestB, "-1)
minn <- (Sp_", TestA, "-1)*(1-Sp_", TestB, ")
maxp <- min(Se_", TestA, ",Se_", TestB, ") - Se_", TestA, "*Se_", TestB, "
maxn <- min(Sp_", TestA, ",Sp_", TestB, ") - Sp_", TestA, "*Sp_", TestB, "
}")
#write as a text (.txt) file
write.table(model_3tests_1pop_dep,
file="model_3tests_1pop_dep.txt",
quote=FALSE,
sep="",
row.names=FALSE,
col.names=FALSE)
The text file would look as follows:
Text file with the LCM for 3 diagnostic tests applied to one population and allowing for conditional dependence between two of the tests (US and TRE).
We could run the model using the jags() function and then
produce the diagnostic plots, compute the ESS, and print out our results
as we did previously (results not shown).
library(R2jags)
library(coda)
#Run the Bayesian model
bug.out <- jags(data=datalist,
model.file="model_3tests_1pop_dep.txt",
parameters.to.save=c("Prev", "Se_US", "Sp_US", "Se_PAG", "Sp_PAG", "Se_TRE", "Sp_TRE", "covp", "covn"),
n.chains=3,
inits=inits,
n.iter=11000,
n.burnin=1000,
n.thin=1,
DIC=FALSE)
library(mcmcplots)
bug.mcmc <- as.mcmc(bug.out)
mcmcplot(bug.mcmc, title="Diagnostic plots")
effectiveSize(bug.mcmc)
print(bug.out, digits.summary=3)
Open up the R project for Exercise 4 (i.e., the R
project file named Exercise 4.Rproj).
In the Exercise 4 folder, you will find partially completed
R scripts. To answer the questions, try to complete the
missing parts (they are highlighted by a #TO COMPLETE#
comment). We also provided complete R scripts, but try to
work it out on your own first!
During the study comparing PAG to US, a third test (a trans-rectal manual exam; TRE) was conducted on most cows. To be honest, this third exam was conducted later (>50 days post-AI). But, for the sake of the exercise, you can consider that the three diagnostic tests (PAG, US, and TRE) were all conducted at the same time point and develop an LCM for three diagnostic tests applied to one single population (i.e., herd #1).
Table. Cross-classified results from a PAG, US, and TRE diagnostic tests in herd #1.
| | TRE+ | | TRE- | |
|---|---|---|---|---|
| | US+ | US- | US+ | US- |
| PAG+ | 126 | 3 | 12 | 7 |
| PAG- | 0 | 0 | 1 | 112 |
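For data entry, the table above has to be flattened into a single 8-cell vector whose ordering matches the one assumed in your model file. As a sketch (not part of the provided exercise scripts), using the cell ordering from the dependence model above, with Test A=US, Test B=TRE, Test C=PAG; double-check it against the ordering in your own model text file:

```r
#Flatten the herd #1 three-test table into the 8-cell vector expected
#by dmulti(). Assumed ordering (as in the dependence model above, with
#Test A=US, Test B=TRE, Test C=PAG):
#(A+B+C+), (A+B-C+), (A-B+C+), (A-B-C+), (A+B+C-), (A+B-C-), (A-B+C-), (A-B-C-)
datalist <- list(n=c(126, 12, 3, 7, 0, 1, 0, 112))
sapply(datalist, sum)  #261 cows tested in total
```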
1. How many degrees of freedom and unknown parameters do you have? How many informative priors will be needed?
2. Start from the partially completed Question 2.R script located in the Exercise 4 folder. Run a model using informative priors for US’ Se and Sp and for pregnancy prevalence, and with vague priors on all other parameters. Then run a second model with vague priors on all parameters. We will assume conditional independence between tests and that this is one single population. Do you expect any problem with the model using vague priors on all parameters?
1. How many degrees of freedom and unknown parameters do you have? How many informative priors will be needed?
Answer: We have 7 degrees of freedom and 7 unknown parameters, so we do not need any informative priors.
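The counting behind this answer can be sketched in base R, assuming conditionally independent tests with accuracy constant across populations (the function names are mine, not from the exercise scripts):

```r
#Degrees of freedom and number of unknown parameters for an LCM with
#conditionally independent tests and accuracy constant across populations.
#Each population contributes 2^n_tests - 1 degrees of freedom; unknowns
#are one prevalence per population plus one Se and one Sp per test.
lcm_df <- function(n_tests, n_pops) n_pops * (2^n_tests - 1)
lcm_unknowns <- function(n_tests, n_pops) n_pops + 2 * n_tests

#3 tests in 1 population, as in this exercise: 7 df and 7 unknowns
c(df=lcm_df(3, 1), unknowns=lcm_unknowns(3, 1))
```

The same functions reproduce the 2-test cases seen earlier (e.g., 2 tests in 1 population gives 3 df for 5 unknowns, hence the need for informative priors there).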
2. Run a model using informative priors for US and pregnancy prevalence and with vague priors on all other parameters. Then run a second model with vague priors on all parameters. We will assume conditional independence between tests and we will also assume that this is one single population. Do you expect any problem with the model using vague priors on all parameters?
Answer: The first model, with informative priors on US’ Se and Sp and on pregnancy prevalence, ran well. The Markov chains looked good. With 30,000 iterations, I get a decent ESS (around 5,000 for the worst parameter, Se_PAG). The median estimates and 95CI are reported in the table below.
A priori, I would also expect the LCM with vague priors on all parameters to work well, since it is identifiable. This was indeed the case; the Markov chains also looked good. The lowest ESS was again around 5,000 (with 3 chains of 10,000 iterations). For publication, we could run these LCMs for more iterations to reach an ESS > 10,000 for all parameters. Results for the LCM with vague priors are also reported in the table below.
There was very little difference between the two LCMs. Only the US’ Se and Sp estimates were slightly affected. The LCM with informative priors generated posterior estimates that were somewhere in between the prior estimates and the estimates from the LCM with vague priors (i.e., between the priors and the data).
Table. Results from a 3 tests, 1 population LCM using different priors (informative on pregnancy prevalence and on US Se and Sp vs. vague priors on all parameters).
| Parameter | Prior (informative) | Posterior (informative) | Prior (vague) | Posterior (vague) |
|---|---|---|---|---|
| Prev | Beta(4.2, 5.4) | 0.539 (0.479, 0.599) | Beta(1.0, 1.0) | 0.541 (0.481, 0.602) |
| Se PAG | Beta(1.0, 1.0) | 0.995 (0.973, 1.00) | Beta(1.0, 1.0) | 0.995 (0.972, 1.00) |
| Sp PAG | Beta(1.0, 1.0) | 0.942 (0.887, 0.978) | Beta(1.0, 1.0) | 0.938 (0.885, 0.974) |
| Se US | Beta(100, 12) | 0.939 (0.904, 0.964) | Beta(1.0, 1.0) | 0.972 (0.933, 0.992) |
| Sp US | Beta(100, 6.2) | 0.969 (0.940, 0.986) | Beta(1.0, 1.0) | 0.986 (0.953, 0.999) |
| Se TRE | Beta(1.0, 1.0) | 0.907 (0.850, 0.949) | Beta(1.0, 1.0) | 0.909 (0.852, 0.950) |
| Sp TRE | Beta(1.0, 1.0) | 0.994 (0.969, 1.00) | Beta(1.0, 1.0) | 0.994 (0.967, 1.00) |
So far we have (implicitly) assumed that the tests’ accuracy (i.e., their Se and Sp) is the same for every individual tested. Some diagnostic tests, however, may behave differently depending on whether they are applied to individuals of a certain age, breed, etc. For instance, we could hypothesize that many diagnostic tests aiming at quantifying antibodies against a given pathogen will behave differently in neonates (which typically have an immature immune system) compared to juvenile or adult animals.
The situation where test accuracy varies as a function of a characteristic of the individual tested (i.e., as a function of a covariate) can be implemented relatively easily in the different LCMs we have used so far. For instance, if we hypothesize that the accuracy of the PAG-based pregnancy test varies as a function of whether the cow is a first-lactation vs. an older cow, we would simply specify an LCM with two different PAG Se and Sp: one for first-lactation cows (PAG_Se_first and PAG_Sp_first) and one for older cows (PAG_Se_old and PAG_Sp_old). You would also provide two sets of data (one for first-lactation cows and one for older cows). Of course, in that specific example, you would now have three additional unknown parameters to estimate (one additional Se, one additional Sp, and one additional prevalence), with three additional degrees of freedom (one additional 2x2 table). So investigating whether test accuracy varies as a function of a covariate will not help you make the LCM identifiable. You will still need another trick to make the model identifiable (e.g., informative priors, >1 population, or >2 diagnostic tests).
If we have two diagnostic tests (Test A and Test B) applied to a single population, and if we assume that the Se and Sp of one of the tests (Test A) vary as a function of a characteristic (Z) of the individual tested, then we would specify one likelihood function for the individuals without the characteristic (Z=0) and one for the individuals having the characteristic (Z=1). Moreover, we would estimate two Se and two Sp for the test that is hypothesized to vary as a function of the covariate (Se_Z0, Sp_Z0, Se_Z1, Sp_Z1). Note that this could be extended to covariates with more than two levels. Thus, the likelihood function would become:
With Z=0:
\(n_0[1:4] \sim dmulti(P[1:4]_0, n_0)\)
\(P1_0 = P_0*SeA_0*SeB + (1-P_0)*(1-SpA_0)*(1-SpB)\)
\(P2_0 = P_0*SeA_0*(1-SeB) + (1-P_0)*(1-SpA_0)*SpB\)
\(P3_0 = P_0*(1-SeA_0)*SeB + (1-P_0)*SpA_0*(1-SpB)\)
\(P4_0 = P_0*(1-SeA_0)*(1-SeB) + (1-P_0)*SpA_0*SpB\)
With Z=1:
\(n_1[1:4] \sim dmulti(P[1:4]_1, n_1)\)
\(P1_1 = P_1*SeA_1*SeB + (1-P_1)*(1-SpA_1)*(1-SpB)\)
\(P2_1 = P_1*SeA_1*(1-SeB) + (1-P_1)*(1-SpA_1)*SpB\)
\(P3_1 = P_1*(1-SeA_1)*SeB + (1-P_1)*SpA_1*(1-SpB)\)
\(P4_1 = P_1*(1-SeA_1)*(1-SeB) + (1-P_1)*SpA_1*SpB\)
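As a quick sanity check of these likelihood contributions, the four cell probabilities within each stratum of Z must sum to 1. A small base R sketch (parameter values are arbitrary, and cell_probs() is a helper of my own, not part of the exercise code):

```r
#Cell probabilities for two conditionally independent tests within one
#stratum of the covariate Z (Test A's Se/Sp are the stratum-specific ones).
cell_probs <- function(prev, se_a, sp_a, se_b, sp_b) {
  c(prev*se_a*se_b         + (1-prev)*(1-sp_a)*(1-sp_b),  #A+B+
    prev*se_a*(1-se_b)     + (1-prev)*(1-sp_a)*sp_b,      #A+B-
    prev*(1-se_a)*se_b     + (1-prev)*sp_a*(1-sp_b),      #A-B+
    prev*(1-se_a)*(1-se_b) + (1-prev)*sp_a*sp_b)          #A-B-
}

#Arbitrary values standing in for the Z=0 stratum
p0 <- cell_probs(prev=0.50, se_a=0.90, sp_a=0.95, se_b=0.99, sp_b=0.95)
sum(p0)  #must equal 1 for dmulti() to be a valid likelihood
```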
To run such a model, we simply need to provide a dataset presenting the cross-tabulated results of the two tests for individuals with Z=0 and a second dataset for the individuals with Z=1. For instance, in the PAG vs. US study, the results of the PAG and US tests can be presented as a function of the cow’s age as follows:
Table. Cross-classified results from a PAG and US in first lactation vs. older cows.
| | First (Z=0) | | Old (Z=1) | |
|---|---|---|---|---|
| | US+ | US- | US+ | US- |
| PAG+ | 76 | 10 | 179 | 11 |
| PAG- | 1 | 65 | 1 | 153 |
The two datasets could be created as follows:
#n is of the form : (TestA pos and TestB pos), (TestA pos and TestB neg), (TestA neg and TestB pos), then (TestA neg and TestB neg)
datalist <- list(Z0=c(76, 1, 10, 65),
Z1=c(179, 1, 11, 153))
We could provide the values that will be used to describe the prior distributions as we did before (so they are included in the model text file). The difference is that we now have, for one of the tests, two Se and two Sp, and we also have one prevalence for Z=0 and one for Z=1. In the example below, I will use informative priors for these two prevalences (I will use the same priors for Z=0 and Z=1, but they could be different) and for US’ Se and Sp.
#I could first create labels for TestA and TestB
TestA <- "US"
TestB <- "PAG"
#I could create labels for Z=0 and Z=1
Z0 <- "First"
Z1 <- "Old"
#Provide information for the prior distributions (all beta distributions) for the unknown parameters
Prev0.shapea <- 4.2 #a shape parameter for Prev in Z=0
Prev0.shapeb <- 5.4 #b shape parameter for Prev in Z=0
Prev1.shapea <- 4.2 #a shape parameter for Prev in Z=1
Prev1.shapeb <- 5.4 #b shape parameter for Prev in Z=1
Se.TestA.shapea <- 131 #a shape parameter for Se of TestA
Se.TestA.shapeb <- 15 #b shape parameter for Se of TestA
Sp.TestA.shapea <- 100 #a shape parameter for Sp of TestA
Sp.TestA.shapeb <- 6 #b shape parameter for Sp of TestA
Se.TestB.Z0.shapea <- 1 #a shape parameter for Se of TestB in Z=0
Se.TestB.Z0.shapeb <- 1 #b shape parameter for Se of TestB in Z=0
Sp.TestB.Z0.shapea <- 1 #a shape parameter for Sp of TestB in Z=0
Sp.TestB.Z0.shapeb <- 1 #b shape parameter for Sp of TestB in Z=0
Se.TestB.Z1.shapea <- 1 #a shape parameter for Se of TestB in Z=1
Se.TestB.Z1.shapeb <- 1 #b shape parameter for Se of TestB in Z=1
Sp.TestB.Z1.shapea <- 1 #a shape parameter for Sp of TestB in Z=1
Sp.TestB.Z1.shapeb <- 1 #b shape parameter for Sp of TestB in Z=1
#I will also need the number of individuals tested in each population (n0 and n1)
n <- sapply(datalist, sum)
n0 <- n[1]
n1 <- n[2]
With that, we have everything that is needed to write the JAGS model.
#Create the JAGS text file
model_2tests_covariate <- paste0("model{
#=== LIKELIHOOD ===#
#For Z=0
Z0[1:4] ~ dmulti(p0[1:4], ",n0,")
p0[1] <- Prev_", Z0,"*Se_", TestA, "*Se_", TestB, "_", Z0," + (1-Prev_", Z0,")*(1-Sp_", TestA, ")*(1-Sp_", TestB, "_", Z0,")
p0[2] <- Prev_", Z0,"*Se_", TestA, "*(1-Se_", TestB, "_", Z0,") + (1-Prev_", Z0,")*(1-Sp_", TestA, ")*Sp_", TestB, "_", Z0,"
p0[3] <- Prev_", Z0,"*(1-Se_", TestA, ")*Se_", TestB, "_", Z0," + (1-Prev_", Z0,")*Sp_", TestA, "*(1-Sp_", TestB, "_", Z0,")
p0[4] <- Prev_", Z0,"*(1-Se_", TestA, ")*(1-Se_", TestB, "_", Z0,") + (1-Prev_", Z0,")*Sp_", TestA, "*Sp_", TestB, "_", Z0,"
#For Z=1
Z1[1:4] ~ dmulti(p1[1:4], ",n1,")
p1[1] <- Prev_", Z1,"*Se_", TestA, "*Se_", TestB, "_", Z1," + (1-Prev_", Z1,")*(1-Sp_", TestA, ")*(1-Sp_", TestB, "_", Z1,")
p1[2] <- Prev_", Z1,"*Se_", TestA, "*(1-Se_", TestB, "_", Z1,") + (1-Prev_", Z1,")*(1-Sp_", TestA, ")*Sp_", TestB, "_", Z1,"
p1[3] <- Prev_", Z1,"*(1-Se_", TestA, ")*Se_", TestB, "_", Z1," + (1-Prev_", Z1,")*Sp_", TestA, "*(1-Sp_", TestB, "_", Z1,")
p1[4] <- Prev_", Z1,"*(1-Se_", TestA, ")*(1-Se_", TestB, "_", Z1,") + (1-Prev_", Z1,")*Sp_", TestA, "*Sp_", TestB, "_", Z1,"
#=== PRIOR ===#
Prev_", Z0," ~ dbeta(",Prev0.shapea,", ",Prev0.shapeb,") ## Prior for Prev in Z=0
Prev_", Z1," ~ dbeta(",Prev1.shapea,", ",Prev1.shapeb,") ## Prior for Prev in Z=1
Se_", TestA, " ~ dbeta(",Se.TestA.shapea,", ",Se.TestA.shapeb,") ## Prior for Se of Test A
Sp_", TestA, " ~ dbeta(",Sp.TestA.shapea,", ",Sp.TestA.shapeb,") ## Prior for Sp of Test A
Se_", TestB, "_", Z0, " ~ dbeta(",Se.TestB.Z0.shapea,", ",Se.TestB.Z0.shapeb,") ## Prior for Se of Test B in Z=0
Sp_", TestB, "_", Z0, " ~ dbeta(",Sp.TestB.Z0.shapea,", ",Sp.TestB.Z0.shapeb,") ## Prior for Sp of Test B in Z=0
Se_", TestB, "_", Z1, " ~ dbeta(",Se.TestB.Z1.shapea,", ",Se.TestB.Z1.shapeb,") ## Prior for Se of Test B in Z=1
Sp_", TestB, "_", Z1, " ~ dbeta(",Sp.TestB.Z1.shapea,", ",Sp.TestB.Z1.shapeb,") ## Prior for Sp of Test B in Z=1
}")
#write as a text (.txt) file
write.table(model_2tests_covariate,
file="model_2tests_covariate.txt",
quote=FALSE,
sep="",
row.names=FALSE,
col.names=FALSE)
With this code, you could, again, simply modify the labels for Test A and Test B, and for Z0 and Z1, and the shape parameters for the prior distributions. The text file with the JAGS model will automatically be updated. Currently, it looks like this:
Text file with the LCM for 2 independent diagnostic tests applied to one population, but with the accuracy of one of the tests varying as a function of a covariate.
Again, we will need to provide a list of initial values (one per Markov chain) for all unknown parameters. Careful: we still have two tests, but two Se and two Sp for one of the tests, and two prevalences.
#Initializing values for the two prevalences and for the Ses and Sps of the two tests, for the 3 chains.
inits <- list(list(Prev_First=0.50,
Prev_Old=0.50,
Se_US=0.90,
Sp_US=0.90,
Se_PAG_First=0.90,
Sp_PAG_First=0.90,
Se_PAG_Old=0.90,
Sp_PAG_Old=0.90),
list(Prev_First=0.60,
Prev_Old=0.60,
Se_US=0.80,
Sp_US=0.80,
Se_PAG_First=0.80,
Sp_PAG_First=0.80,
Se_PAG_Old=0.80,
Sp_PAG_Old=0.80),
list(Prev_First=0.40,
Prev_Old=0.40,
Se_US=0.70,
Sp_US=0.70,
Se_PAG_First=0.70,
Sp_PAG_First=0.70,
Se_PAG_Old=0.70,
Sp_PAG_Old=0.70)
)
We can run the model using the jags() function as seen
before.
library(R2jags)
library(coda)
#Run the Bayesian model
bug.out <- jags(data=datalist,
model.file="model_2tests_covariate.txt",
parameters.to.save=c("Prev_First", "Prev_Old","Se_US", "Sp_US", "Se_PAG_First", "Sp_PAG_First", "Se_PAG_Old", "Sp_PAG_Old"),
n.chains=3,
inits=inits,
n.iter=11000,
n.burnin=1000,
n.thin=1,
DIC=FALSE)
Then we could produce the diagnostic plots, compute the ESS, and print out our results as we did previously (results not shown).
library(mcmcplots)
bug.mcmc <- as.mcmc(bug.out)
mcmcplot(bug.mcmc, title="Diagnostic plots")
effectiveSize(bug.mcmc)
print(bug.out, digits.summary=3)
Open up the R project for Exercise 5 (i.e., the R
project file named Exercise 5.Rproj).
In the Exercise 5 folder, you will find partially completed
R scripts. To answer the questions, try to complete the
missing parts (they are highlighted by a #TO COMPLETE#
comment). We also provided complete R scripts, but try to
work it out on your own first!
Some anecdotal reports suggest that the expression of PAG may vary as a function of the number of days of pregnancy. We could, therefore, hypothesize that PAG’s Se and Sp may differ when the test is conducted early (≤35 days post-breeding) vs. later (>35 days post-breeding). When cross-tabulating the study’s results for the PAG and US tests as a function of days since breeding, we get:
Table. Cross-classified results from a PAG and US test as function of number of days since breeding.
| | ≤35 days | | >35 days | |
|---|---|---|---|---|
| | US+ | US- | US+ | US- |
| PAG+ | 143 | 11 | 112 | 10 |
| PAG- | 2 | 136 | 0 | 82 |
1. Start from the partially completed Question 1.R script located in the Exercise 5 folder. Build a model for two diagnostic tests and one single population, but with the Se and Sp of the PAG test varying as a function of days since breeding. Use the Romano et al. (2006) results to inform the priors for the Se and Sp of a US exam (Se: 0.90, 95CI: 0.85, 0.95; Sp: 0.95, 95CI: 0.90, 0.97). Assume conditional independence between tests. How many degrees of freedom do you have and how many unknown parameters do you need to estimate? Based on your results, would you conclude that PAG accuracy is influenced by the number of days since breeding?
1. How many degrees of freedom do you have and how many unknown parameters do you need to estimate? Based on your results, would you conclude that PAG accuracy is influenced by number of days since breeding?
Answer: We would have 6 degrees of freedom and 8 unknown parameters to estimate (one Se and one Sp for US, two Se and two Sp for PAG, and two prevalences). We thus need at least two informative priors to make the LCM identifiable.
The results from this LCM are below.
Table. Results for the LCM allowing for varying accuracy of PAG as function of days since breeding.
| Parameter | Median | 95CI |
|---|---|---|
| Prevalence 28-35 days | 0.521 | 0.462, 0.580 |
| Prevalence 36-45 days | 0.584 | 0.512, 0.653 |
| Se US | 0.928 | 0.895, 0.956 |
| Sp US | 0.977 | 0.957, 0.990 |
| Se PAG 28-35 days | 0.991 | 0.963, 1.00 |
| Se PAG 36-45 days | 0.994 | 0.966, 1.00 |
| Sp PAG 28-35 days | 0.980 | 0.927, 0.999 |
| Sp PAG 36-45 days | 0.962 | 0.879, 0.998 |
There was little difference between the accuracy parameters at 28-35 vs. 36-45 days since breeding. Below, we have plotted the two PAG Sp posterior distributions to illustrate how similar they are. However, since we are not in a Frequentist framework, “testing” whether the 28-35 days vs. 36-45 days accuracy parameters are statistically different is not straightforward.
Posterior distributions for PAG specificity at 28-35 days (black) and 36-45 days (red) post-breeding.
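If you do want a quantitative comparison, one simple option is to work directly with the MCMC draws: compute the posterior probability that one Sp exceeds the other, or a credible interval for their difference. The sketch below uses simulated draws as stand-ins for the actual MCMC output; with R2jags, the real draws sit in the fitted object’s BUGSoutput$sims.list (the parameter names used here are hypothetical).

```r
#Sketch: comparing two posterior distributions draw by draw. Here I
#simulate draws as stand-ins; with a fitted R2jags object the actual
#draws could be extracted, e.g.:
#  sp_early <- bug.out$BUGSoutput$sims.list$Sp_PAG_Early  #hypothetical name
set.seed(123)
sp_early <- rbeta(30000, 150, 4)  #stand-in for Sp PAG, 28-35 days
sp_late  <- rbeta(30000, 100, 5)  #stand-in for Sp PAG, 36-45 days

#Posterior probability that Sp is higher at 28-35 than at 36-45 days
mean(sp_early > sp_late)

#Posterior median and 95CI for the difference between the two Sp
quantile(sp_early - sp_late, probs=c(0.025, 0.500, 0.975))
```

A posterior probability near 0.5 (and a credible interval for the difference straddling zero) would support the visual impression that the two specificities are similar.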
In this document we have described some of the “main” LCM. More specifically, we described LCMs that can be used with:
-Two conditionally independent diagnostic tests applied in one single population
-Two conditionally dependent diagnostic tests applied in one single population
-Two conditionally independent diagnostic tests applied in more than one population
-Two conditionally dependent diagnostic tests applied in more than one population
-More than two conditionally independent diagnostic tests applied in one single population
-More than two diagnostic tests applied in one single population and with conditional dependence between two of the tests
-Two conditionally independent diagnostic tests, but with the accuracy of one or both tests varying as a function of a characteristic of the individual tested
From here, it is up to you to adapt these LCMs to fit your needs. For instance, you could have a study where three tests were applied to one population, but two of these tests were also applied to a second population. For two of the tests, you may want to relax the assumption of conditional independence by including covariance terms between them. Finally, you may want to allow the Se and Sp of one of the tests to vary as a function of a characteristic of the individual tested.
When working with a “customized” LCM, the principles that we have discussed still apply. It will always be a good starting point to:
-Count the number of unknown parameters to estimate
-Organize the dataset(s)
-Evaluate whether some test combinations are not observed
-Count the number of degrees of freedom available
-Identify the number of informative priors that will be needed
-Consider whether scientific literature is available to develop the needed informative priors
Based on this initial assessment, you will be able to decide whether running this LCM is doable, or if it would be better to revise your plans.
An important step of any Bayesian analysis is to evaluate whether the results are strongly affected by the choice of priors and/or by the assumptions made.
Regarding the sensitivity of our results to the assumptions made, this could be assessed by comparing the results from the LCM where a given assumption was made vs. those from an LCM where the assumption was relaxed. Of course, to fully “isolate” the effect of a given assumption, the same priors would have to be used in both LCMs. This is, however, not always possible, since the “relaxed” LCM will often need a larger number of informative priors to make it identifiable.
In the end, providing the results from an “alternative” LCM could help readers understand the influence of the chosen model on the presented results. As the author, you would probably want to use the LCM that is the most “biologically relevant” as your main model.
Whenever informative priors are used in an LCM, the sensitivity of the LCM to the choice of priors should be assessed. This can be done by running the same LCM, but with “perturbed” and usually “more diffuse” priors. Johnson et al. (2019) recommended, for instance, moving the mode of the prior distribution of a given parameter by a few percentage-points and increasing the width of its prior distribution. Then, the results from the LCM using the scientific literature-derived priors can be compared to those of the LCM using perturbed priors to assess how sensitive the results are to the choice of priors. The amount and direction by which the mode is moved, and the degree to which the width of the distribution is increased, will depend on the problem at hand.
For instance, in the preceding sections we used the Romano
et al. (2006) results, to inform the priors for the Se and Sp of a
US exam conducted between 28-45 days post-insemination:
-Se = 0.90 (95CI: 0.85, 0.95)
-Sp = 0.95 (95CI: 0.90, 0.97)
We thus came up with these prior distributions for US’ Se and Sp.
Thus, a possibility could be to move the Se and Sp modes down by 10 percentage-points (i.e., mode for Se: 0.80 and mode for Sp: 0.85) and to increase the width of the distributions by moving the 2.5th percentiles down by 30 percentage-points (Se 2.5th percentile: 0.55; Sp 2.5th percentile: 0.60). Thus:
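Translating a desired mode and 2.5th percentile into Beta shape parameters can be done numerically in base R; this is essentially what the epi.betabuster() function of the epiR library does. A minimal sketch, using a helper of my own, shown for the perturbed Se prior (mode 0.80, 2.5th percentile 0.55):

```r
#Find the Beta(a, b) distribution with a given mode and 2.5th percentile,
#by searching over the concentration k, with a = 1 + mode*k and
#b = 1 + (1-mode)*k (this keeps the mode fixed while k controls the spread).
#beta_from_mode() is my own helper, not a function from an R library.
beta_from_mode <- function(mode, q025) {
  f <- function(k) qbeta(0.025, 1 + mode*k, 1 + (1-mode)*k) - q025
  k <- uniroot(f, interval=c(0.1, 1e4), tol=1e-9)$root
  c(a=1 + mode*k, b=1 + (1-mode)*k)
}

#Perturbed prior for US' Se: mode 0.80, 2.5th percentile 0.55
beta_from_mode(0.80, 0.55)
```

The shapes obtained this way should land close to the perturbed Beta(13.6, 4.1) Se prior used below; repeating the call with mode 0.85 and 2.5th percentile 0.60 gives the perturbed Sp prior.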
Then, the results from the LCM with the informative and perturbed priors could be compared.
For instance, in Exercise 3 we compared two conditionally independent diagnostic tests (PAG and US) in three populations (herd #1, herd #2, and herd #3). Although that model is identifiable, we could still use the Romano et al. (2006) informative priors on US’ Se and Sp. Then, we could run the same LCM, but with the perturbed US Se and Sp priors suggested above. If we do that, we obtain the following results:
Table. Results from a 2 tests, 3 populations LCM using different priors (literature-based vs. perturbed informative priors on US’ Se and Sp). The priors differing between LCM are indicated in bold.
| Parameter | Prior (Romano et al. 2006) | Posterior | Prior (perturbed) | Posterior |
|---|---|---|---|---|
| Prev1 | Beta(1.0, 1.0) | 0.558 (0.496, 0.618) | Beta(1.0, 1.0) | 0.556 (0.494, 0.617) |
| Prev2 | Beta(1.0, 1.0) | 0.523 (0.440, 0.604) | Beta(1.0, 1.0) | 0.520 (0.435, 0.603) |
| Prev3 | Beta(1.0, 1.0) | 0.559 (0.452, 0.662) | Beta(1.0, 1.0) | 0.557 (0.449, 0.657) |
| Se US | **Beta(100, 12)** | 0.925 (0.893, 0.954) | **Beta(13.6, 4.1)** | 0.931 (0.890, 0.970) |
| Sp US | **Beta(100, 6.2)** | 0.977 (0.956, 0.990) | **Beta(14.0, 3.3)** | 0.981 (0.957, 0.994) |
| Se PAG | Beta(1.0, 1.0) | 0.996 (0.981, 1.00) | Beta(1.0, 1.0) | 0.996 (0.980, 1.00) |
| Sp PAG | Beta(1.0, 1.0) | 0.981 (0.934, 0.999) | Beta(1.0, 1.0) | 0.977 (0.922, 0.999) |
In this specific example, we can see that the results are virtually unchanged. Thus, we could say that our initial analysis was quite robust to the choice of prior distributions. Personally, we would present the LCM with the Romano et al. (2006) priors as our “main” model. If we do have valid information from the literature about one or many of the unknown parameters, then why not use it?